Abstract

This paper is devoted to a deeper understanding of the heat flow and to the refinement 
of calculus tools on metric measure spaces (X, d,m). Our main results are: 

• A general study of the relations between the Hopf-Lax semigroup and the Hamilton-Jacobi
equation in metric spaces $(X,\mathsf d)$.

• The equivalence of the heat flow in $L^2(X,\mathfrak m)$ generated by a suitable Dirichlet energy
and the Wasserstein gradient flow of the relative entropy functional $\mathrm{Ent}_{\mathfrak m}$ in the space
of probability measures $\mathscr P(X)$.

• The proof of density in energy of Lipschitz functions in the Sobolev space $W^{1,2}(X,\mathsf d,\mathfrak m)$
under the only assumption that $\mathfrak m$ is locally finite.

• A fine and very general analysis of the differentiability properties of a large class of 
Kantorovich potentials, in connection with the optimal transport problem. 

Our results apply in particular to spaces satisfying Ricci curvature bounds in the sense of 
Lott & Villani [28] and Sturm [35, 36] and require neither the doubling property nor the 
validity of the local Poincaré inequality.
1 Introduction

The aim of this paper is to provide a deeper understanding of analysis in metric measure spaces,
with a particular focus on the properties of the heat flow. Our main results, whose validity
does not depend on doubling and Poincaré assumptions, are:

(i) The proof that the Hopf-Lax formula produces sub-solutions of the Hamilton-Jacobi
equation on general metric spaces $(X,\mathsf d)$, and solutions if $(X,\mathsf d)$ is a length space.

(ii) The proof of equivalence of the heat flow in $L^2(X,\mathfrak m)$ generated by a suitable Dirichlet
energy and the Wasserstein gradient flow in $\mathscr P_2(X)$ of the relative entropy functional
$\mathrm{Ent}_{\mathfrak m}$ w.r.t. $\mathfrak m$.

(iii) The proof that Lipschitz functions are always dense in energy in the Sobolev space
$W^{1,2}$. This is achieved by showing the equivalence of two weak notions of modulus
of the gradient: the first one (inspired by Cheeger [10], see also [21], [19], and the
recent review [20]), that we call relaxed gradient, is defined by $L^2(X,\mathfrak m)$-relaxation of
the pointwise Lipschitz constant in the class of Lipschitz functions; the second one
(inspired by Shanmugalingam [34]), that we call weak upper gradient, is based on the
validity of the fundamental theorem of calculus along almost all curves. These two
notions of gradient will be compared and identified, assuming only $\mathfrak m$ to be locally finite.
We might consider the former gradient as a "vertical" derivative, related to variations
in the dependent variable, while the latter is a "horizontal" derivative, related to
variations in the independent variable.

(iv) A fine and very general analysis of the differentiability properties of a large class of 
Kantorovich potentials, in connection with the optimal transport problem. 

Our results apply in particular to spaces satisfying Ricci curvature bounds in the sense of 
Lott & Villani [28] and Sturm [35, 36], that we call in this introduction LSV spaces. Indeed, 
the development of a "calculus" in this class of spaces has been one of our motivations. In 
particular we are able to prove the following result (see Theorem 9.3 for a more precise and 
general statement): if $(X,\mathsf d,\mathfrak m)$ is a $CD(K,\infty)$ space and $\mathfrak m \in \mathscr P(X)$, then

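(a) For every $\mu = \rho\mathfrak m \in \mathscr P_2(X)$ the Wasserstein slope $|\nabla^-\mathrm{Ent}_{\mathfrak m}|^2(\mu)$ of the relative entropy
$\mathrm{Ent}_{\mathfrak m}$ coincides with the Fisher information functional $\int_{\{\rho>0\}} |\nabla\rho|_*^2/\rho\, d\mathfrak m$, where $|\nabla\rho|_*$
is the relaxed gradient of $\rho$ (see the brief discussion before (1.3)).

(b) For every $\mu_0 = f_0\mathfrak m \in D(\mathrm{Ent}_{\mathfrak m}) \cap \mathscr P_2(X)$ there exists a unique gradient flow $\mu_t = f_t\mathfrak m$ of
$\mathrm{Ent}_{\mathfrak m}$ starting from $\mu_0$ in $(\mathscr P_2(X), W_2)$, and if $f_0 \in L^2(X,\mathfrak m)$ the functions $f_t$ coincide
with the $L^2(X,\mathfrak m)$ gradient flow of Cheeger's energy $\mathrm{Ch}_*$, defined by (see also (1.3) for
an equivalent definition)

$$\mathrm{Ch}_*(f) := \frac12 \inf\Big\{ \liminf_{h\to\infty} \int_X |\nabla f_h|^2\, d\mathfrak m \ :\ f_h \in \mathrm{Lip}(X),\ \int_X |f_h - f|^2\, d\mathfrak m \to 0 \Big\}. \tag{1.1}$$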

On the other hand, we believe that the "calculus" results described in (iii) are of a wider 
interest for analysis in metric measure spaces, beyond the application to LSV spaces. Particularly
important is not only the identification of heat flows, but also the identification of
weak gradients that was previously known only under doubling and Poincaré assumptions.
The key new idea is to use the heat flow and the rate of energy dissipation, instead of the 
usual covering arguments, to prove the optimal approximation by Lipschitz functions, see also 
Remark 4.6 and Remark 5.10 for a detailed comparison with the previous approaches. 

In connection with (ii), notice that the equivalence so far has been proved in Euclidean 
spaces by Jordan-Kinderlehrer-Otto, in the seminal paper [22], in Riemannian manifolds by
Erbar and Villani [13, 38], in Hilbert spaces by [5], in Finsler spaces by Ohta-Sturm [29] and 
eventually in Alexandrov spaces by Gigli-Kuwada-Ohta [17]. In fact, the strategy pursued in 
[17], that we shall describe later on, had a great influence on our work. The distinguished 
case when the gradient flows are linear will be the object, in connection with LSV spaces, of 
a detailed investigation in [3]. 

We exploit as much as possible the variational formulation of gradient flows on one hand 
(based on sharp energy dissipation rate and the notion of descending slope) and the variational 
structure of the optimal transportation problem to develop a theory that does not rely on 
finite dimensionality and doubling properties; we are even able to cover situations where the 
distance $\mathsf d$ is allowed to take the value $+\infty$, as it happens for instance in optimal transportation
problems in Wiener spaces (see for instance [15, 14]). We are also able to deal with $\sigma$-finite
measures $\mathfrak m$, provided they are representable in the form $e^{V^2}\tilde{\mathfrak m}$ with $\tilde{\mathfrak m}(X) \le 1$ and
$V : X \to [0,\infty)$ a $\mathsf d$-Lipschitz weight function bounded from above on compact sets.

In order to reach this level of generality, it is useful to separate the roles of the topology
$\tau$ of $X$ (used for the measure-theoretic structure) and of the possibly extended distance $\mathsf d$
involved in the optimal transport problem, introducing the concept of Polish extended metric
measure space $(X,\mathsf d,\tau,\mathfrak m)$. Of course, the case when $\mathsf d$ is a distance inducing the Polish
topology $\tau$ is included. Since we assume neither doubling properties nor the validity of the
Poincaré inequalities, we cannot rely on Cheeger's theory [10], developed precisely under these
assumptions. The only known connection between synthetic curvature bounds and this set
of assumptions is given in [27], where the authors prove that in non-branching LSV spaces
the Poincaré inequality holds under the so-called $CD(K,N)$ assumption ($N < \infty$), a stronger
curvature assumption which involves also the dimension.

Now we pass to a more detailed description of the results of the paper, the problems and the
related literature. In Section 2 we introduce all the basic concepts used in the paper: first we
define extended metric spaces $(X,\mathsf d)$, Polish extended spaces $(X,\mathsf d,\tau)$ (in our axiomatization $\mathsf d$
and $\tau$ are not completely decoupled, see (iii) and (iv) in Definition 2.3), absolutely continuous
curves, metric derivative $|\dot x_t|$, local Lipschitz constant $|\nabla f|$, one-sided slopes $|\nabla^\pm f|$. Then, we
see how in Polish extended spaces one can naturally state the optimal transport problem with
cost $c = \mathsf d^2$ either in terms of transport plans (i.e. probability measures in $X \times X$) or, when
the space is geodesic, in terms of geodesic transport plans, namely probability measures, with
prescribed marginals at $t = 0$, $t = 1$, in the space $\mathrm{Geo}(X)$ of constant speed geodesics in $X$.
In Subsection 2.5 we recall the basic definition of gradient flow $(y_t)$ of an energy functional
$E$: it is based on the integral formulation of the sharp energy dissipation rate

$$-\frac{d}{dt}E(y_t) \ge \frac12|\dot y_t|^2 + \frac12|\nabla^- E|^2(y_t),$$

which, under suitable additional assumptions (for instance the fact that $|\nabla^- E|$ is an upper
gradient of $E$, as it happens for $K$-geodesically convex functionals), turns into an equality for
almost every time. These facts will play a fundamental role in our analysis.
In Section 3 we study the fine properties of the Hopf-Lax semigroup

$$Q_t f(x) := \inf_{y \in X}\Big( f(y) + \frac{\mathsf d^2(x,y)}{2t} \Big), \qquad (x,t) \in X \times (0,\infty) \tag{1.2}$$

in an extended metric space $(X,\mathsf d)$. Here the main technical novelty, with respect to [26], is
the fact that we do not rely on Cheeger's theory (in fact, no reference measure $\mathfrak m$ appears
here) to show in Proposition 3.6 that in length spaces $(x,t) \mapsto Q_t f(x)$ is a pointwise solution
to the Hamilton-Jacobi equation $\partial_t Q_t f + |\nabla Q_t f|^2/2 = 0$: precisely, for given $x$, the equation
fails for at most countably many times $t$. This is achieved by refining the estimates in
[4, Lemma 3.1.2] and looking at the monotonicity properties w.r.t. $t$ of the quantities

$$D^+(x,t) := \sup \limsup_{n\to\infty} \mathsf d(x,y_n), \qquad D^-(x,t) := \inf \liminf_{n\to\infty} \mathsf d(x,y_n),$$

where the supremum and the infimum run among all minimizing sequences $(y_n)$ in (1.2).
Although only the easier subsolution property $\partial_t Q_t f + |\nabla Q_t f|^2/2 \le 0$ (which does not involve
the length condition) will play a crucial role in the results of Sections 6 and 8, another
byproduct of this refined analysis is a characterization of the slope of $Q_t f$ (see Proposition 3.6)
which applies, to some extent, also to Kantorovich potentials (see Section 10).
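
To make the Hopf-Lax formula concrete, the following is a minimal numerical sketch, not part of the theory developed here. It assumes a finite sample of the real line with $\mathsf d(x,y) = |x-y|$, where the infimum in (1.2) becomes a minimum over the sample; the names `hopf_lax`, `points`, `dist` and `f_vals` are hypothetical and chosen only for illustration.

```python
def hopf_lax(f_vals, dist, t):
    """Discrete Hopf-Lax: Q_t f(x_i) = min_j [ f(x_j) + d(x_i, x_j)^2 / (2 t) ]."""
    if t <= 0:
        raise ValueError("t must be positive")
    n = len(f_vals)
    return [
        min(f_vals[j] + dist[i][j] ** 2 / (2.0 * t) for j in range(n))
        for i in range(n)
    ]


# Assumed sample: 401 equally spaced points of [-2, 2] with d(x, y) = |x - y|.
points = [-2.0 + 0.01 * k for k in range(401)]
dist = [[abs(p - q) for q in points] for p in points]
f_vals = [0.5 * p * p for p in points]  # f(y) = y^2 / 2

q1 = hopf_lax(f_vals, dist, 1.0)  # Q_1 f on the sample
q2 = hopf_lax(f_vals, dist, 2.0)  # Q_2 f on the sample
```

For $f(y) = y^2/2$ on the real line one has $Q_t f(x) = x^2/(2(1+t))$, so the entry of `q1` at the sample point $x = 1$ should be close to $1/4$. The sketch also exhibits the elementary facts $Q_t f \le f$ and $t \mapsto Q_t f(x)$ nonincreasing, which hold in any metric space.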

In Section 4 we follow quite closely [10], defining the collection of relaxed gradients of $f$ as
the weak limits of $|\nabla f_n|$, where $f_n$ are $\mathsf d$-Lipschitz and $f_n \to f$ in $L^2(X,\mathfrak m)$ (the differences
with respect to [10] are detailed in Remark 4.6). The collection of all these weak limits is a
convex closed set in $L^2(X,\mathfrak m)$, whose minimal element is called relaxed gradient, and denoted
by $|\nabla f|_*$. One can then see that Cheeger's convex and lower semicontinuous functional (1.1)
can be equivalently represented as

$$\mathrm{Ch}_*(f) = \frac12 \int_X |\nabla f|_*^2\, d\mathfrak m \tag{1.3}$$

(set to $+\infty$ if $f$ has no relaxed gradient) and get a canonical gradient flow in $L^2(X,\mathfrak m)$
of $\mathrm{Ch}_*$, and a notion of Laplacian $\Delta_{\mathsf d,\mathfrak m}$ associated to $\mathrm{Ch}_*$. As explained in Remark 4.12
and Remark 4.14, this construction can be trivial if no other assumption on $(X,\mathsf d,\tau,\mathfrak m)$ is
imposed, and in any case $\mathrm{Ch}_*$ is not necessarily a quadratic form and the Laplacian, though
1-homogeneous, is not necessarily linear. Precisely because of this potential nonlinearity we
avoided the terminology "Dirichlet form", usually associated to quadratic forms, in connection
with $\mathrm{Ch}_*$.

It is also possible to consider the one-sided slopes $|\nabla^\pm f|$, getting one-sided relaxed gradients
$|\nabla^\pm f|_*$ and Cheeger's corresponding functionals $\mathrm{Ch}^\pm_*$; eventually, but this fact is not trivial,
we prove that the one-sided relaxed functionals coincide with $\mathrm{Ch}_*$, see Remark 6.4.

Section 5 is devoted to the "horizontal" notion of modulus of gradient, that we call weak
upper gradient, along the lines of [34]: roughly speaking, we say that $G$ is a weak upper
gradient of $f$ if the inequality $|f(\gamma_0) - f(\gamma_1)| \le \int_\gamma G$ holds along "almost all" curves with respect
to a suitable collection $\mathcal T$ of probability measures concentrated on absolutely continuous
curves, see Definition 5.4 for the precise statement. The class of weak upper gradients has
good stability properties that allow us to define a minimal weak upper gradient, that we shall
denote by $|\nabla f|_{w,\mathcal T}$, and to prove that $|\nabla f|_{w,\mathcal T} \le |\nabla f|_*$ $\mathfrak m$-a.e. in $X$ for all $f \in D(\mathrm{Ch}_*)$ if the measures in $\mathcal T$ are
concentrated on the class of all the absolutely continuous curves with finite 2-energy.

Section 6 is devoted to proving the converse inequality and therefore to showing that in fact the
two notions of gradient coincide. The proof relies on the fine analysis of the rate of dissipation
of the entropy $\int_X h_t \log h_t\, d\mathfrak m$ along the gradient flow of $\mathrm{Ch}_*$, and on the representation of $h_t\mathfrak m$
as the time marginal of a random curve. The fact that $h_t\mathfrak m$ (having a priori only $L^2(X,\mathfrak m)$
regularity in time and Sobolev regularity in space) can be viewed as an absolutely continuous
curve with values in $(\mathscr P(X), W_2)$ is a consequence of Lemma 6.1, inspired by [17, Proposition 3.7].
More precisely, the metric derivative of $t \mapsto h_t\mathfrak m$ w.r.t. the Wasserstein distance can
be estimated as follows:

$$|\dot{h_t\mathfrak m}|^2 \le \int_X \frac{|\nabla h_t|_*^2}{h_t}\, d\mathfrak m \qquad \text{for a.e. } t \in (0,\infty). \tag{1.4}$$

The latter estimate, written in an integral form, follows by a delicate approximation procedure,
the Kantorovich duality formula and the fine properties of the Hopf-Lax semigroup we
proved.

In Section 7 we introduce the relative entropy functional and the Fisher information

$$\mathrm{Ent}_{\mathfrak m}(\rho\mathfrak m) := \int_X \rho \log\rho\, d\mathfrak m, \qquad F(\rho) := 4\int_X |\nabla\sqrt\rho|_*^2\, d\mathfrak m,$$

and prove two crucial inequalities for the descending slope of $\mathrm{Ent}_{\mathfrak m}$: the first one, still based
on Lemma 6.1, provides the lower bound via the Fisher information

$$F(\rho) = 4\int_X |\nabla\sqrt\rho|_*^2\, d\mathfrak m \le |\nabla^-\mathrm{Ent}_{\mathfrak m}|^2(\mu) \qquad \text{if } \mu = \rho\mathfrak m, \tag{1.5}$$

and the second one, combining [38, Theorem 20.1] with an approximation argument, the
upper bound when $\rho$ is $\mathsf d$-Lipschitz (and satisfies further technical assumptions if $\mathfrak m(X) = \infty$)

$$|\nabla^-\mathrm{Ent}_{\mathfrak m}|^2(\mu) \le 4\int_X |\nabla^-\sqrt\rho|^2\, d\mathfrak m \qquad \text{if } \mu = \rho\mathfrak m. \tag{1.6}$$

The identification of the squared descending slope of $\mathrm{Ent}_{\mathfrak m}$ (which is always a convex functional,
as we show in §7.3) with the Fisher information thus follows, whenever $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$
satisfies a lower semicontinuity property, as in the case of LSV spaces.

In Section 8 we show how the uniqueness proof written by the second author in [16] for
the case of finite reference measures can be adapted, thanks to the tightness properties of the
relative entropy, to our more general framework: we prove uniqueness of the gradient flow
of $\mathrm{Ent}_{\mathfrak m}$ first for flows with uniformly bounded densities and then, assuming that $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$
is an upper gradient, without any restriction on the densities. In this way we obtain the
key property that the Wasserstein gradient flow of $\mathrm{Ent}_{\mathfrak m}$, understood in the metric sense of
Subsection 2.5, has a unique solution for a given initial condition with finite entropy. This
uniqueness phenomenon should be compared with the recent work [30], where it is shown that
in LSV spaces (precisely in Finsler spaces) contractivity of the Wasserstein distance along
the semigroup may fail.

In Section 8.3 we prove the equivalence of the two gradient flows, in the natural class
where a comparison is possible, namely nonnegative initial conditions $f_0 \in L^1 \cap L^2(X,\mathfrak m)$ (if
$\mathfrak m(X) = \infty$ we impose also that $\int_X f_0 V^2\, d\mathfrak m < \infty$). In the proof of this result, that requires
suitable assumptions on $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$, we follow the new strategy introduced in [17]: while the
traditional approach aims at showing that the Wasserstein gradient flow $\mu_t = f_t\mathfrak m$ solves a
"conventional" PDE, here we show the converse, namely that the gradient flow of Cheeger's
energy provides solutions to the Wasserstein gradient flow. Then, uniqueness (and existence)
at the more general level of Wasserstein gradient flow provides equivalence of the two gradient
flows. The key properties to prove the validity of the sharp dissipation rate

$$-\frac{d}{dt}\mathrm{Ent}_{\mathfrak m}(f_t\mathfrak m) \ge \frac12|\dot{f_t\mathfrak m}|^2 + \frac12|\nabla^-\mathrm{Ent}_{\mathfrak m}|^2(f_t\mathfrak m),$$

where $f_t$ is the gradient flow of $\mathrm{Ch}_*$, are the slope estimate (1.6) and the metric derivative
estimate (1.4).

We also emphasize that some results of ours, as the uniqueness provided in Theorem 8.1
for flows with bounded densities, or the full convergence as the time step tends to 0 of the
Jordan-Kinderlehrer-Otto scheme in Corollary 8.2, require no assumption on the space (except
for an exponential volume growth condition) and on $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$, so that they are applicable even
to spaces which are known not to be LSV or for which the lower semicontinuity of $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$
fails or is unknown, as Carnot groups endowed with the Carnot-Carathéodory distance and
the Haar measure.

In Section 9 we show, still following to a large extent [16], the crucial lower semicontinuity
of $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$ in LSV spaces; this shows that all existence and uniqueness results of Section 8.3
are applicable to LSV spaces and that the correspondence between the heat flows is complete.

The paper ends, in the last section, with results that are important for the development 
of a "calculus" with Kantorovich potentials. They will play a key role in some proofs of [3]. 
We included these results here because their validity does not really depend on curvature 
properties, but rather on their implications, namely the existence of geodesic interpolations 
satisfying suitable $L^\infty$ bounds.
Under these assumptions, in Theorem 10.3 we prove that the ascending slope $|\nabla^+\varphi|$ is the
minimal weak upper gradient for Kantorovich potentials $\varphi$. A nice byproduct of this proof is a
"metric" Brenier theorem, namely the fact that the transport distance $\mathsf d(x,y)$ coincides for $\gamma$-a.e.
$(x,y)$ with $|\nabla^+\varphi|(x)$ even when the transport plan $\gamma$ is multi-valued. In addition, $|\nabla^+\varphi|$
coincides $\mathfrak m$-a.e. with the relaxed and weak upper gradients. To some extent, the situation
here is "dual" to the one appearing in the transport problem with cost=Euclidean distance:
in that situation, one knows the direction of transport, without knowing the distance. In
addition, we obtain in Theorem 10.4 a kind of differentiability property of $\varphi$ along transport
geodesics.

Eventually, we want to highlight an important application of the present paper to the
theory of Ricci bounds from below for metric measure spaces. It is well known [29] that LSV
spaces, while stable under Gromov-Hausdorff convergence and consistent with the smooth
Riemannian case, include also Finsler geometries. It is therefore natural to look for additional
axioms, still stable and consistent, that rule out these geometries, thus getting a finer
description of Gromov-Hausdorff limits of Riemannian manifolds. In [3] we prove, relying
in particular on the results obtained in Section 6, Section 9 and Section 10 of this paper,
that LSV spaces whose associated heat flow is linear have this stability property. In addition,
we show that LSV bounds and linearity of the heat flow are equivalent to a single
condition, namely the existence of solutions to the Wasserstein gradient flow of $\mathrm{Ent}_{\mathfrak m}$ in the
EVI sense, implying nice contraction and regularization properties of the flow; we call these
Riemannian lower bounds on Ricci curvature. Finally, for this stronger notion we provide
good tensorization and localization properties.

Acknowledgement. The authors acknowledge the support of the ERC ADG GeMeThNES.

2 Preliminary notions 

In this section we introduce the basic metric, topological and measure-theoretic concepts used 
in the paper. 

2.1 Extended metric and Polish spaces 

In this paper we consider metric spaces whose distance function may attain the value $\infty$; we
call them extended metric spaces.

Definition 2.1 (Extended distance and extended metric spaces) An extended distance
on $X$ is a map $\mathsf d : X^2 \to [0,\infty]$ satisfying

$\mathsf d(x,y) = 0$ if and only if $x = y$,

$\mathsf d(x,y) = \mathsf d(y,x)$ $\forall x, y \in X$,

$\mathsf d(x,y) \le \mathsf d(x,z) + \mathsf d(z,y)$ $\forall x, y, z \in X$.

If $\mathsf d$ is an extended distance on $X$, we call $(X,\mathsf d)$ an extended metric space.

Most of the definitions concerning metric spaces generalize verbatim to extended metric 
spaces, since extended metric spaces can be written as a disjoint union of metric spaces, which 
are simply defined as 

$$X_{[x]} := \{y \in X : \mathsf d(y,x) < \infty\}, \qquad x \in X. \tag{2.1}$$
For instance it makes perfect sense to speak about a complete or length extended metric
space.
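
The decomposition into classes of points at finite mutual distance can be illustrated on a finite sample. The sketch below is only an illustration under stated assumptions: the labelled points and the extended distance `example_dist` (infinite across different labels) are hypothetical. It relies on the fact that, by the triangle inequality, having finite distance is an equivalence relation, so comparing with one representative per class suffices.

```python
import math


def finite_distance_classes(points, dist):
    """Partition points into classes of the relation dist(x, y) < infinity."""
    classes = []
    for p in points:
        for cls in classes:
            # One representative is enough: finiteness of dist is transitive.
            if dist(p, cls[0]) < math.inf:
                cls.append(p)
                break
        else:
            classes.append([p])
    return classes


def example_dist(p, q):
    """Assumed extended distance on labelled reals: infinite across labels."""
    return abs(p[1] - q[1]) if p[0] == q[0] else math.inf


points = [("a", 0.0), ("b", 1.0), ("a", 2.5), ("b", -1.0), ("c", 0.0)]
classes = finite_distance_classes(points, example_dist)
```

Each resulting class, equipped with the restriction of the distance, is a genuine metric space, which is why notions such as completeness or the length property can be checked class by class.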

Definition 2.2 ($\mathsf d$-Lipschitz functions and Lipschitz constant) We say that $f : X \to \mathbb R$
is $\mathsf d$-Lipschitz if there exists $C \ge 0$ satisfying

$|f(x) - f(y)| \le C\,\mathsf d(x,y)$ $\forall x, y \in X$.

The least constant $C$ with this property will be denoted by $\mathrm{Lip}(f)$.

In our framework the roles of the distance $\mathsf d$ (used to define optimal transport) and of the
topology are distinct. This justifies the following definition. Recall that a topological space
$(X,\tau)$ is said to be Polish if $\tau$ is induced by a complete and separable distance.

Definition 2.3 (Polish extended spaces) We say that $(X,\tau,\mathsf d)$ is a Polish extended space
if:

(i) $\tau$ is a topology on $X$ and $(X,\tau)$ is Polish;

(ii) $\mathsf d$ is an extended distance on $X$ and $(X,\mathsf d)$ is a complete extended metric space;

(iii) For $(x_h) \subset X$, $x \in X$, $\mathsf d(x_h,x) \to 0$ implies $x_h \to x$ w.r.t. the topology $\tau$;

(iv) $\mathsf d$ is lower semicontinuous in $X \times X$, with respect to the $\tau \times \tau$ topology.

In the sequel, when $\mathsf d$ is not explicitly mentioned, all the topological notions (in particular
the class of compact sets, the class of Borel sets $\mathscr B(X)$, the class $C_b(X)$ of bounded continuous
functions and the class $\mathscr P(X)$ of Borel probability measures) are always referred to
the topology $\tau$, even when $\mathsf d$ is a distance. When $(X,\mathsf d)$ is separable (thus any $\mathsf d$-open set is
a countable union of $\mathsf d$-closed balls, which are also $\tau$-closed by (iv)), then a subset of $X$ is
$\mathsf d$-Borel if and only if it is $\tau$-Borel, but when $(X,\mathsf d)$ is not separable $\mathscr B(X)$ can be a strictly
smaller class than the Borel sets generated by $\mathsf d$.

The Polish condition on $\tau$ guarantees that all Borel probability measures $\mu \in \mathscr P(X)$ are
tight, a property (shared with the more general class of Radon spaces, see e.g. [4, Def. 5.1.4])
which justifies the introduction of the weaker topology $\tau$. In fact most of the results of the
present paper could be extended to Radon spaces, thus including Lusin and Suslin topologies
[33].

Notice that the only compatibility conditions between the possibly extended distance $\mathsf d$ and
$\tau$ are (iii) and (iv). Condition (iii) guarantees that convergence in $(\mathscr P(X), W_2)$, as defined
in Section 2.4, implies weak convergence, namely convergence in the duality with $C_b(X)$.
Condition (iv) enables us, when the cost function $c$ equals $\mathsf d^2$, to use the standard results of
the Kantorovich theory (existence of optimal plans, duality, etc.) and other useful properties,
as the lower semicontinuity of the length and the $p$-energy of a curve w.r.t. pointwise $\tau$-convergence,
or the representation results of [23].

An example where the roles of the distance and the topology are different is provided
by bounded closed subsets of the dual of a separable Banach space: in this case $\mathsf d$ is the
distance induced by the dual norm and $\tau$ is the weak* topology. In this case $\tau$ enjoys better
compactness properties than $\mathsf d$.

The typical example of Polish extended space is a separable Banach space $(X, \|\cdot\|)$ endowed
with a Gaussian probability measure $\gamma$. In this case $\tau$ is the topology induced by the norm
and $\mathsf d$ is the Cameron-Martin extended distance induced by $\gamma$ (see [6]): thus, differently from
$(X,\tau)$, $(X,\mathsf d)$ is not separable if $\dim X = \infty$.

It will be technically convenient to use also the class $\mathscr B^*(X)$ of universally measurable
sets (and the associated universally measurable functions): it is the $\sigma$-algebra of sets which
are $\mu$-measurable for any $\mu \in \mathscr P(X)$.



2.2 Absolutely continuous curves and slopes 

If $(X,\mathsf d)$ is an extended metric space, $J \subset \mathbb R$ is an open interval, $p \in [1,\infty]$ and $\gamma : J \to X$,
we say that $\gamma$ belongs to $AC^p(J;(X,\mathsf d))$ if

$$\mathsf d(\gamma_s,\gamma_t) \le \int_s^t g(r)\, dr \qquad \forall s, t \in J,\ s < t$$

for some $g \in L^p(J)$. The case $p = 1$ corresponds to absolutely continuous curves, whose space
is simply denoted by $AC(J;(X,\mathsf d))$. It turns out that, if $\gamma$ belongs to $AC^p(J;(X,\mathsf d))$, there is
a minimal function $g$ with this property, called metric derivative and given for a.e. $t \in J$ by

$$|\dot\gamma_t| := \lim_{s\to t} \frac{\mathsf d(\gamma_s,\gamma_t)}{|s-t|}.$$

See [4, Theorem 1.1.2] for the simple proof. We say that an absolutely continuous curve $\gamma_t$
has constant speed if $|\dot\gamma_t|$ is (equivalent to) a constant.
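
As a quick sanity check of the definition of metric derivative, the following sketch approximates the limit by a one-sided difference quotient. It is only an illustration under assumptions that are not in the text: the curve is the unit circle in the Euclidean plane, parametrized by arc length, and the helper names `metric_speed`, `euclidean` and `unit_circle` are hypothetical.

```python
import math


def metric_speed(curve, dist, t, h=1e-4):
    """Difference quotient dist(curve(t + h), curve(t)) / h approximating the metric derivative."""
    return dist(curve(t + h), curve(t)) / h


def euclidean(p, q):
    """Euclidean distance in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def unit_circle(t):
    """Arc-length parametrization of the unit circle."""
    return (math.cos(t), math.sin(t))


speeds = [metric_speed(unit_circle, euclidean, t) for t in (0.0, 0.5, 1.0, 2.0)]
```

The quotient uses only the distance between points of the curve, never a linear structure, and for this curve it stays close to 1 at every sampled time, consistent with the curve having constant speed.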

Notice that, by the completeness of $(X,\mathsf d)$, $AC^p(J;(X,\mathsf d)) \subset C(J;X)$, the set of $\tau$-continuous
curves $\gamma : J \to X$. For $t \in J$ we define the evaluation map $e_t : C(J;X) \to X$
by

$$e_t(\gamma) := \gamma_t.$$

We endow $C(J;X)$ with the sup extended distance

$$\mathsf d^*(\gamma,\tilde\gamma) := \sup_{t \in J} \mathsf d(\gamma_t,\tilde\gamma_t)$$

and with the compact-open topology $\tau^*$, whose fundamental system of neighborhoods is

$$\{\gamma \in C(J;X) : \gamma(K_i) \subset U_i,\ i = 1,2,\dots,n\}, \qquad K_i \subset J \text{ compact},\ U_i \in \tau,\ n \ge 1.$$

With these choices, it can be shown that $(C(J;X),\tau^*,\mathsf d^*)$ inherits a Polish extended structure
from $(X,\tau,\mathsf d)$. Also, with this topology it is clear that the evaluation maps are continuous
from $(C(J;X),\tau^*)$ to $(X,\tau)$. Since for $p > 1$ the $p$-energy

$$\mathcal E_p[\gamma] := \int_J |\dot\gamma_t|^p\, dt \ \text{ if } \gamma \in AC^p(J;(X,\mathsf d)), \qquad \mathcal E_p[\gamma] := \infty \ \text{ otherwise}, \tag{2.2}$$

is $\tau^*$-lower-semicontinuous thanks to (iv) of Definition 2.3, $AC^p(J;(X,\mathsf d))$ is a Borel subset of
$C(J;X)$. It is not difficult to check that $AC(J;(X,\mathsf d))$ is a Borel set as well; indeed, denoting
$J = (a,b)$ and defining

$$TV(\gamma,(a,s)) := \sup\Big\{ \sum_{i=0}^{n-1} \mathsf d(\gamma_{t_{i+1}},\gamma_{t_i}) \ :\ n \in \mathbb N,\ a < t_0 < \cdots < t_n < s \Big\}, \qquad s \in (a,b],$$






it can be immediately seen that $TV(\gamma,(a,s))$ is lower semicontinuous in $\gamma$ and nondecreasing
in $s$. Also, a continuous $\gamma$ is absolutely continuous iff the Stieltjes measure associated to
$TV(\gamma,(a,\cdot))$ is absolutely continuous w.r.t. $\mathscr L^1$; by an integration by parts, this can be
characterized in terms of $m_\varepsilon(\gamma) \downarrow 0$ as $\varepsilon \downarrow 0$, where

$$m_\varepsilon(\gamma) := \sup\Big\{ \int_a^b TV(\gamma,(a,s))\,\psi'(s)\, ds \ :\ \psi \in C^1_c(a,b),\ \max|\psi| \le 1,\ \int_a^b |\psi(s)|\, ds \le \varepsilon \Big\} \tag{2.3}$$

if $TV(\gamma,(a,b))$ is finite, $m_\varepsilon(\gamma) = +\infty$ otherwise. Since $m_\varepsilon$ are Borel in $C(J;X)$, thanks to
the separability of $C^1_c(a,b)$ w.r.t. the $C^1$ norm, the Borel regularity of $AC(J;(X,\mathsf d))$ follows.

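We call $(X,\mathsf d)$ a geodesic space if for any $x_0, x_1 \in X$ with $\mathsf d(x_0,x_1) < \infty$ there exists a
curve $\gamma : [0,1] \to X$ satisfying $\gamma_0 = x_0$, $\gamma_1 = x_1$ and

$$\mathsf d(\gamma_s,\gamma_t) = |t-s|\,\mathsf d(\gamma_0,\gamma_1) \qquad \forall s, t \in [0,1]. \tag{2.4}$$

We will denote by $\mathrm{Geo}(X)$ the space of all constant speed geodesics $\gamma : [0,1] \to X$, namely
$\gamma \in \mathrm{Geo}(X)$ if (2.4) holds. Given $f : X \to \overline{\mathbb R}$ we define its effective domain $D(f)$ by

$$D(f) := \{x \in X : f(x) \in \mathbb R\}. \tag{2.5}$$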

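Given $f : X \to \overline{\mathbb R}$ and $x \in D(f)$, we define the local Lipschitz constant at $x$ by

$$|\nabla f|(x) := \limsup_{y\to x} \frac{|f(y)-f(x)|}{\mathsf d(y,x)}.$$

We shall also need the one-sided counterparts of the local Lipschitz constant, called respectively
descending slope and ascending slope:

$$|\nabla^- f|(x) := \limsup_{y\to x} \frac{(f(y)-f(x))^-}{\mathsf d(y,x)}, \qquad |\nabla^+ f|(x) := \limsup_{y\to x} \frac{(f(y)-f(x))^+}{\mathsf d(y,x)}. \tag{2.6}$$

When $x \in D(f)$ is an isolated point of $X$, we set $|\nabla f|(x) = |\nabla^- f|(x) = |\nabla^+ f|(x) := 0$,
while all slopes are conventionally set to $+\infty$ on $X \setminus D(f)$.
Notice that for all $x \in D(f)$ it holds

$$|\nabla f|(x) = \max\{|\nabla^- f|(x), |\nabla^+ f|(x)\}, \qquad |\nabla^- f|(x) = |\nabla^+(-f)|(x). \tag{2.7}$$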
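
The slopes can be explored numerically by replacing the limsup with a supremum of difference quotients over a small punctured ball, sampled at finitely many points. The sketch below is a rough illustration only, assuming $X = \mathbb R$ with the usual distance; the function `slope_quotients`, its parameters and the test function $f(y) = |y|$ are hypothetical choices, and a sampled supremum is merely an approximation of the true slopes.

```python
def slope_quotients(f, x, radius, samples=1000):
    """Sampled sups of difference quotients of f over 0 < |y - x| < radius.

    Returns (ascending, descending, two_sided), approximating the ascending
    slope, the descending slope and the local Lipschitz constant at x.
    """
    ascending = descending = two_sided = 0.0  # the sup over the empty set is 0
    for k in range(1, samples + 1):
        h = radius * k / (samples + 1)
        for y in (x - h, x + h):
            diff = f(y) - f(x)
            d = abs(y - x)
            ascending = max(ascending, max(diff, 0.0) / d)
            descending = max(descending, max(-diff, 0.0) / d)
            two_sided = max(two_sided, abs(diff) / d)
    return ascending, descending, two_sided


# f(y) = |y| at its minimum point x = 0: no descent direction, ascent at rate 1.
up, down, both = slope_quotients(abs, 0.0, 1e-3)

# The same point for -f: by (2.7) the roles of the two slopes are exchanged.
up_neg, down_neg, both_neg = slope_quotients(lambda y: -abs(y), 0.0, 1e-3)
```

At a minimum point the descending slope vanishes while the ascending slope does not, which is the basic asymmetry between the two one-sided notions; the two identities in (2.7) are visible in the relations among the returned values.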

Also, for $f, g : X \to \overline{\mathbb R}$ it is not difficult to check that

$$|\nabla(\alpha f + \beta g)| \le |\alpha||\nabla f| + |\beta||\nabla g| \qquad \forall \alpha, \beta \in \mathbb R, \tag{2.8a}$$
$$|\nabla(fg)| \le |f||\nabla g| + |g||\nabla f| \tag{2.8b}$$

on $D(f) \cap D(g)$. Also, if $\chi : X \to [0,1]$, it holds

$$|\nabla^\pm(\chi f + (1-\chi)g)| \le \chi|\nabla^\pm f| + (1-\chi)|\nabla^\pm g| + |\nabla\chi|\,|f-g|. \tag{2.9}$$

Indeed, adding the identities

$$\chi(y)f(y) - \chi(x)f(x) = \chi(y)(f(y)-f(x)) + f(x)(\chi(y)-\chi(x)),$$
$$\tilde\chi(y)g(y) - \tilde\chi(x)g(x) = \tilde\chi(y)(g(y)-g(x)) + g(x)(\tilde\chi(y)-\tilde\chi(x))$$

with $\tilde\chi = 1 - \chi$ one obtains

$$\frac{(\chi f + \tilde\chi g)(y) - (\chi f + \tilde\chi g)(x)}{\mathsf d(y,x)} = \chi(y)\frac{f(y)-f(x)}{\mathsf d(y,x)} + \tilde\chi(y)\frac{g(y)-g(x)}{\mathsf d(y,x)} + \frac{\chi(y)-\chi(x)}{\mathsf d(y,x)}\,(f(x)-g(x)),$$

from which the inequality readily follows by taking the positive or negative parts and letting
$y \to x$. We shall also need the measurability of slopes, ensured by the following lemma.

Lemma 2.4 If $f : X \to \overline{\mathbb R}$ is Borel, then its slopes $|\nabla^\pm f|$ (and therefore $|\nabla f|$) are $\mathscr B^*(X)$-measurable
in $D(f)$. In particular, if $\gamma : [0,1] \to X$ is a continuous curve with $\gamma_t \in D(f)$
for a.e. $t \in [0,1]$, then the ($\mathscr L^1$-almost everywhere defined) functions $|\nabla^\pm f| \circ \gamma$ are Lebesgue
measurable.

Proof. By (2.7) it is sufficient to consider the case of the ascending slope and, since the
functions

$$G_r(x) := \sup_{\{y :\ 0 < \mathsf d(x,y) < r\}} \frac{(f(y)-f(x))^+}{\mathsf d(y,x)}$$

(with the convention $\sup\emptyset = 0$, so that $G_r(x) = 0$ for $r$ small enough if $x$ is an isolated point)
monotonically converge to $|\nabla^+ f|$ on $D(f)$, it is sufficient to prove that $G_r$ is universally
measurable for any $r > 0$. For any $r > 0$ and $\alpha \ge 0$ we see that the set

$$\{x \in D(f) : G_r(x) > \alpha\}$$

is the projection on the first factor of the Borel set

$$\{(x,y) \in D(f) \times X : f(y) - f(x) > \alpha\,\mathsf d(x,y),\ 0 < \mathsf d(x,y) < r\},$$

so it is a Suslin set (see [7, Proposition 1.10.8]) and therefore it is universally measurable (see
[7, Theorem 1.10.5]).

To check the last statement of the lemma it is sufficient to recall [12, Remark 32 (c2)] that
a continuous curve $\gamma$ is $(\mathscr B^*([0,1]), \mathscr B^*(X))$-measurable, since any set in $\mathscr B^*(X)$ is measurable
for all images of measures $\mu \in \mathscr P([0,1])$ under $\gamma$. □

Finally, for completeness we include the simple proof of the fact that $|\nabla^- f| = |\nabla^+ f|$ $\mathfrak m$-a.e.
if $\mathsf d$ is finite, $f$ is $\mathsf d$-Lipschitz and $(X,\mathsf d,\mathfrak m)$ is a doubling metric measure space. We will be
able to prove a weaker version of this result even in non-doubling situations, see Remark 6.4.

Proposition 2.5 If $\mathsf d$ is finite, and $(X,\mathsf d,\mathfrak m)$ is doubling, for all $\mathsf d$-Lipschitz $f : X \to \mathbb R$,
$|\nabla^- f| = |\nabla^+ f|$ $\mathfrak m$-a.e. in $X$.

Proof. Let $\alpha' > \alpha > 0$ and consider the set $H := \{|\nabla^- f| \le \alpha\}$. Let $H_m$ be the subset of
points $x$ such that $f(x) - f(y) \le \alpha'\mathsf d(x,y)$ for all $y$ satisfying $\mathsf d(x,y) < 1/m$. By the doubling
property, the equality $H = \cup_m H_m$ ensures that $\mathfrak m$-a.e. $x \in H$ is a point of density 1 for some
set $H_m$. If we fix $x$ with this property and $\mathsf d(x_n,x) \to 0$, we can estimate

$$f(x_n) - f(x) = f(x_n) - f(y_n) + f(y_n) - f(x) \le \mathrm{Lip}(f)\,\mathsf d(x_n,y_n) + \alpha'\mathsf d(y_n,x)$$

choosing $y_n \in H_m \cap B_{1/m}(x)$. But, thanks to the doubling property, since the density of $H_m$
at $x$ is 1 we can choose $y_n$ in such a way that $\mathsf d(x_n,y_n) = o(\mathsf d(x_n,x))$. Indeed, if for some
$\delta > 0$ the ball $B_{\delta\mathsf d(x_n,x)}(x_n)$ does not intersect $H_m$ for infinitely many $n$, the upper density
of $X \setminus H_m$ in the balls $B_{2\mathsf d(x_n,x)}(x)$ is strictly positive. Dividing both sides by $\mathsf d(x_n,x)$ the
arbitrariness of the sequence yields $|\nabla^+ f|(x) \le \alpha'$.

Since $\alpha$ and $\alpha'$ are arbitrary we conclude that $|\nabla^+ f| \le |\nabla^- f|$ $\mathfrak m$-a.e. in $X$. The proof of
the converse inequality is similar. □



2.3 Upper gradients 

According to [10], we say that a function $g : X \to [0,\infty]$ is an upper gradient of $f : X \to \overline{\mathbb R}$
if, for any curve $\gamma \in AC((0,1);(D(f),\mathsf d))$, $s \mapsto g(\gamma_s)|\dot\gamma_s|$ is measurable in $[0,1]$ (with the
convention $0\cdot\infty = 0$) and

$$\Big|\int_{\partial\gamma} f\Big| \le \int_\gamma g. \tag{2.11}$$

Here and in the following we write $\int_{\partial\gamma} f$ for $f(\gamma_1) - f(\gamma_0)$ and $\int_\gamma g = \int_0^1 g(\gamma_s)|\dot\gamma_s|\, ds$.

It is not difficult to see that if $f$ is a Borel and $\mathsf d$-Lipschitz function then the two slopes
and the local Lipschitz constant are upper gradients. More generally, the following remark
will be useful.

Remark 2.6 (When slopes are upper gradients along a curve) Notice that if one a
priori knows that $t \mapsto f(\gamma_t)$ is absolutely continuous along a given absolutely continuous
curve $\gamma : [0,1] \to D(f)$, then $|\nabla^\pm f|$ are upper gradients of $f$ along $\gamma$. Indeed, $|\nabla^\pm(f\circ\gamma)|$ are
bounded from above by $|\nabla^\pm f| \circ \gamma\,|\dot\gamma|$ wherever the metric derivative $|\dot\gamma|$ exists; then, one uses
the fact that at any differentiability point both slopes of $f\circ\gamma$ coincide with $|(f\circ\gamma)'|$. ■

The next lemma is a refinement of [4, Lemma 1.2.6]; as usual, we adopt the convention
$0\cdot\infty = 0$.

Lemma 2.7 (Absolute continuity criterion) Let $L \in L^1(0,1)$ be nonnegative and let
$g : [0,1] \to [0,\infty]$ be a measurable map with $\int_0^1 L\, dt > 0$ and $\int_0^1 g(t)L(t)\, dt < \infty$. Let $w : [0,1] \to \mathbb R \cup \{-\infty\}$
be an upper semicontinuous map, with $w > -\infty$ a.e. on $\{L \ne 0\}$, satisfying

$$w(s) - w(t) \le g(t)\,\Big|\int_s^t L(r)\, dr\Big| \qquad \text{for all } t \in \{w > -\infty\} \tag{2.12}$$

and, for arbitrary $0 \le a < b \le 1$,

$$\int_a^b L\, dt = 0 \quad \Longrightarrow \quad w \text{ is constant in } [a,b]. \tag{2.13}$$

Then $\{w = -\infty\}$ is empty and $w$ is absolutely continuous in $[0,1]$.

Proof. It is not restrictive to assume L(t) dt = 1 and set A := L^^|[qj]. We introduce the 
monotone, right continuous map t : [0, 1] [0, 1] pushing ^ onto A: setting 

x(t) := I L{r) dr = A([0,t]) it holds t(x) := sup{t G [0, 1] : x(t) < x}, 

Jo 

and considering the function ^ := o t we easily get 

/•t(2/) fb rx{b) 

/ L{r) dr = \x - y\ 0<x<y<l, / g{t)L{t)dt= j g{z)dz a, 6 G [0,1], (2.14) 

Jt{x) J a Jx(a) 
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so that, defining also w := wot, (2.12) becomes 

w{y) — w(x) < g{x)\x — y\ for all x £ {w > — oo}. (2-15) 

Notice that $\tilde w$ is still upper semicontinuous: since it is the composition of an upper semicontinuous function with the increasing right continuous map $\mathsf{t}$, we have just to check this
property at the jump set of $\mathsf{t}$. If $x \in (0,1]$ satisfies $\mathsf{t}_-(x) = \lim_{y\uparrow x}\mathsf{t}(y) < \mathsf{t}(x)$, since $w$ is
constant in $[\mathsf{t}_-(x),\mathsf{t}(x)]$ we have

\[ \limsup_{y\uparrow x}\tilde w(y) = \limsup_{s\uparrow \mathsf{t}_-(x)} w(s) \le w(\mathsf{t}_-(x)) = w(\mathsf{t}(x)) = \tilde w(x). \]

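In particular $\tilde w$ is bounded from above and choosing $y_0$ such that $\tilde w(y_0) > -\infty$ we get
$\tilde w(x) \ge \tilde w(y_0) - \tilde g(x)$ for every $x \in \{\tilde w > -\infty\}$, so that $\tilde w$ is integrable. Since

\[ |\tilde w(y) - \tilde w(x)| \le (\tilde g(x) + \tilde g(y))\,|x-y| \qquad\text{for every } x,y \in (0,1)\setminus\{\tilde w = -\infty\}, \]

applying [4, Lemma 2.1.6] we obtain that $\tilde w \in W^{1,1}(0,1)$ and $|\tilde w'| \le 2\tilde g$ a.e. in $(0,1)$.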

Since $\tilde w \in W^{1,1}(0,1)$ there exists a continuous representative $\bar w$ of $\tilde w$ in the Lebesgue
equivalence class of $\tilde w$. Since any point in $[0,1]$ can be approximated by points in the coincidence set, the upper semicontinuity of $\tilde w$ gives $\tilde w \ge \bar w > -\infty$ in $[0,1]$. We can apply (2.15) to obtain (in the case $y < 1$)

\[ \tilde w(y) - \frac{1}{h}\int_y^{y+h}\tilde w(x)\,\mathrm{d}x \le \int_y^{y+h}\tilde g(x)\,\mathrm{d}x \to 0 \qquad\text{as } h \downarrow 0. \]

Since $\int_y^{y+h}\tilde w(x)\,\mathrm{d}x \sim h\,\bar w(y)$ as $h \downarrow 0$, we obtain the opposite inequality $\tilde w(y) \le \bar w(y)$ for every
$y \in [0,1)$. In the case $y = 1$ the argument is similar.

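We thus obtain $|\tilde w(x) - \tilde w(y)| \le 2\int_x^y \tilde g(z)\,\mathrm{d}z$ for every $0 \le x \le y \le 1$. Now, the fact that
$w$ is constant in any closed interval where $\mathsf{x}$ is constant ensures the validity of the identity
$w(s) = w(\mathsf{t}(\mathsf{x}(s)))$, so that $w(s) = \tilde w(\mathsf{x}(s))$ and the second equality in (2.14) yields

\[ |w(s) - w(t)| = |\tilde w(\mathsf{x}(s)) - \tilde w(\mathsf{x}(t))| \le 2\int_{\mathsf{x}(s)}^{\mathsf{x}(t)} \tilde g(z)\,\mathrm{d}z = 2\int_s^t g(r)L(r)\,\mathrm{d}r, \qquad 0 \le s < t \le 1. \]

□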



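Corollary 2.8 Let $\gamma \in AC([0,1];(X,\mathsf{d}))$ and let $\varphi : X \to \mathbb{R}\cup\{-\infty\}$ be a $\mathsf{d}$-upper semicontinuous map such that $\varphi(\gamma_s) > -\infty$ a.e. in $[0,1]$. Let $g : X \to [0,\infty]$ be such that
$g\circ\gamma\,|\dot\gamma| \in L^1(0,1)$ and

\[ \varphi(\gamma_s) - \varphi(\gamma_t) \le g(\gamma_t)\,\Big|\int_s^t |\dot\gamma_r|\,\mathrm{d}r\Big| \qquad\text{for all } t \text{ such that } \varphi(\gamma_t) > -\infty. \tag{2.16} \]

Then the map $s \mapsto \varphi(\gamma_s)$ is real valued and absolutely continuous.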



The proof is an immediate application of Lemma 2.7 with $L := |\dot\gamma|$ and $w := \varphi\circ\gamma$; (2.13) is
true since $\gamma$ (and thus $\varphi\circ\gamma$) is constant on every interval where $|\dot\gamma|$ vanishes a.e.

2.4 The space $(\mathscr{P}(X), W_2)$

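Here we assume that $(X,\tau,\mathsf{d})$ is a Polish extended space. Given $\mu,\nu \in \mathscr{P}(X)$, we define the
Wasserstein distance $W_2$ between them as

\[ W_2^2(\mu,\nu) := \inf \int_{X\times X} \mathsf{d}^2(x,y)\,\mathrm{d}\gamma(x,y), \tag{2.17} \]

where the infimum is taken among all $\gamma \in \mathscr{P}(X\times X)$ such that $\pi^1_\sharp\gamma = \mu$ and $\pi^2_\sharp\gamma = \nu$, with $\pi^1,\pi^2$ the coordinate projections.

Such measures are called admissible plans (or couplings) for the pair $(\mu,\nu)$. As usual, if
$\mu \in \mathscr{P}(X)$ and $T : X \to Y$ is a $\mu$-measurable map with values in the topological space $Y$,
the push-forward measure $T_\sharp\mu \in \mathscr{P}(Y)$ is defined by $T_\sharp\mu(B) := \mu(T^{-1}(B))$ for every set
$B \in \mathscr{B}(Y)$.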

We are not restricting ourselves to the space of measures with finite second moments, so
that it can possibly happen that $W_2(\mu,\nu) = \infty$. Still, via standard arguments one can prove
that $W_2$ is an extended distance in $\mathscr{P}(X)$. Also, we point out that if we define

\[ \mathscr{P}_{[\mu]}(X) := \{\nu \in \mathscr{P}(X) : W_2(\mu,\nu) < \infty\} \]

for some $\mu \in \mathscr{P}(X)$, then the space $(\mathscr{P}_{[\mu]}(X), W_2)$ is actually a complete metric space (which
reduces to the standard one $(\mathscr{P}_2(X), W_2)$ if $\mu$ is a Dirac mass and $\mathsf{d}$ is finite).

Concerning the relation between $W_2$ convergence and weak convergence, the implication

\[ W_2(\mu_n,\mu) \to 0 \quad\Longrightarrow\quad \int_X \varphi\,\mathrm{d}\mu_n \to \int_X \varphi\,\mathrm{d}\mu \qquad \forall \varphi \in C_b(X) \tag{2.18} \]

is well known if $(X,\mathsf{d})$ is a metric space and $\tau$ is induced by the distance $\mathsf{d}$, see for instance [4,
Proposition 7.1.5]; the implication remains true in our setting, with the same proof, thanks
to the compatibility condition (iii) of Definition 2.3.

Since $\mathsf{d}^2$ is $\tau$-lower semicontinuous, when $W_2(\mu,\nu) < \infty$ the infimum in the definition
(2.17) of $W_2^2$ is attained and we call optimal all the plans $\gamma$ realizing the minimum; Kantorovich's duality formula holds:

\[ \frac12 W_2^2(\mu,\nu) = \sup\Big\{ \int_X \varphi\,\mathrm{d}\mu + \int_X \psi\,\mathrm{d}\nu \;:\; \varphi(x) + \psi(y) \le \frac12\mathsf{d}^2(x,y) \Big\}, \tag{2.19} \]

where the functions $\varphi$ and $\psi$ in the supremum are respectively $\mu$-measurable and $\nu$-measurable,
and in $L^1$. One can also restrict, without affecting the value of the supremum, to bounded
and continuous functions $\varphi$, $\psi$ (see [4, Theorem 6.1.1]).

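Recall that the c-transform $\varphi^c$ of $\varphi : X \to \mathbb{R}\cup\{-\infty\}$ is defined by

\[ \varphi^c(y) := \inf\Big\{ \frac{\mathsf{d}^2(x,y)}{2} - \varphi(x) \;:\; x \in X \Big\} \]

and that $\psi$ is said to be c-concave if $\psi = \varphi^c$ for some $\varphi$.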
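As an illustration of the definition, the following Python sketch computes the c-transform on a finite subset of the real line with cost $\mathsf{d}^2/2$, where the infimum becomes a minimum. The point set and the values of `phi` are arbitrary choices made for this sketch and are not taken from the text.

```python
def c_transform(phi, points):
    """Return phi^c(y) = min_x [ (x - y)^2 / 2 - phi(x) ] for y in points.

    `phi` maps each point of the finite set `points` (a subset of the real
    line, an assumption of this sketch) to a real value.
    """
    return {y: min((x - y) ** 2 / 2.0 - phi[x] for x in points) for y in points}


# Hypothetical data: four points on the line and arbitrary values of phi.
points = [0.0, 0.5, 1.0, 2.0]
phi = {0.0: 0.0, 0.5: 1.0, 1.0: -0.3, 2.0: 0.7}

phi_c = c_transform(phi, points)        # phi^c
phi_cc = c_transform(phi_c, points)     # (phi^c)^c
phi_ccc = c_transform(phi_cc, points)   # ((phi^c)^c)^c
```

On such data one can check directly that $\varphi(x) + \varphi^c(y) \le \mathsf{d}^2(x,y)/2$ for all pairs, that $(\varphi^c)^c \ge \varphi$, and that a third transform returns $\varphi^c$, so $\varphi^c$ is c-concave.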

c-concave functions are always $\mathsf{d}$-upper semicontinuous, hence Borel in the case when $\mathsf{d}$ is
finite and induces $\tau$. More generally, it is not difficult to check that

\[ \varphi \text{ Borel} \quad\Longrightarrow\quad \varphi^c \;\; \mathscr{B}^*(X)\text{-measurable}. \tag{2.20} \]

The proof follows, as in Lemma 2.4, from Suslin's theory: indeed, the set $\{\varphi^c < \alpha\}$ is the
projection on the second coordinate of the Borel set of points $(x,y)$ such that $\mathsf{d}^2(x,y)/2 - \varphi(x) < \alpha$, so it is a Suslin set and therefore universally measurable.

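If $\varphi(x)$, $\psi(y)$ satisfy $\varphi(x) + \psi(y) \le \mathsf{d}^2(x,y)/2$, since $\varphi^c \ge \psi$ still satisfies $\varphi + \varphi^c \le \mathsf{d}^2/2$
and since we may restrict ourselves to bounded continuous functions, we obtain

\[ \frac12 W_2^2(\mu,\nu) = \sup\Big\{ \int_X \varphi\,\mathrm{d}\mu + \int_X \varphi^c\,\mathrm{d}\nu \;:\; \varphi \in C_b(X) \Big\}. \tag{2.21} \]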

Definition 2.9 (Kantorovich potential) Assume that $\mathsf{d}$ is a finite distance. We say that
a map $\varphi : X \to \mathbb{R}\cup\{-\infty\}$ is a Kantorovich potential relative to an optimal plan $\gamma$ if:

(i) $\varphi$ is c-concave, not identically equal to $-\infty$ and Borel;

(ii) $\varphi(x) + \varphi^c(y) = \frac12\mathsf{d}^2(x,y)$ for $\gamma$-a.e. $(x,y) \in X\times X$.

Since $\varphi$ is not identically equal to $-\infty$ the function $\varphi^c$ still takes values in $\mathbb{R}\cup\{-\infty\}$ and the
c-concavity of $\varphi$ ensures that $\varphi = (\varphi^c)^c$. Notice that we are not requiring integrability of $\varphi$
and $\varphi^c$, although condition (ii) forces $\varphi$ (resp. $\varphi^c$) to be finite $\mu$-a.e. (resp. $\nu$-a.e.).

The existence of maximizing pairs in the duality formula can be a difficult task if d is 
unbounded, and no general result is known when $\mathsf{d}$ may attain the value $\infty$. For this reason
we restrict ourselves to finite distances d in the previous definition and in the next proposition, 
concerning the main existence and integrability result for Kantorovich potentials. 

Proposition 2.10 (Existence of Kantorovich potentials) If $\mathsf{d}$ is finite and $\gamma$ is an optimal plan with finite cost, then Kantorovich potentials $\varphi$ relative to $\gamma$ exist. In addition, if
$\mathsf{d}(x,y) \le a(x) + b(y)$ with $a \in L^2(X,\mu)$ and $b \in L^2(X,\nu)$, the functions $\varphi$, $\varphi^c$ are respectively
$\mu$-integrable and $\nu$-integrable and provide maximizers in the duality formula (2.19). In this
case $\varphi$ is a Kantorovich potential relative to any optimal plan $\gamma$.

Proof. Existence of $\varphi$ follows by a well-known argument, see for instance [4, Theorem 6.1.4]:
one makes the Rüschendorf-Rockafellar construction of a c-concave function $\varphi$ starting from
a $\sigma$-compact and $\mathsf{d}^2$-monotone set $\Gamma$ on which $\gamma$ is concentrated. The last statement follows
by

\[ \frac12 W_2^2(\mu,\nu) = \int_X \varphi\,\mathrm{d}\mu + \int_X \varphi^c\,\mathrm{d}\nu \le \int_{X\times X} \frac12\mathsf{d}^2\,\mathrm{d}\tilde\gamma \]

for any admissible plan $\tilde\gamma$. □



2.5 Geodesically convex functionals and gradient flows 

Given an extended metric space $(Y,\mathsf{d}_Y)$ (in the sequel it will mostly be a Wasserstein space)
and $K \in \mathbb{R}$, a functional $E : Y \to \mathbb{R}\cup\{+\infty\}$ is said to be $K$-geodesically convex if for any
$y_0, y_1 \in D(E)$ with $\mathsf{d}_Y(y_0,y_1) < \infty$ there exists $\gamma \in \mathrm{Geo}(Y)$ such that $\gamma_0 = y_0$, $\gamma_1 = y_1$ and

\[ E(\gamma_t) \le (1-t)E(y_0) + tE(y_1) - \frac{K}{2}\,t(1-t)\,\mathsf{d}_Y^2(y_0,y_1) \qquad \forall t \in [0,1]. \]

A consequence of $K$-geodesic convexity is that the descending slope defined in (2.6) can
be calculated as

\[ |\nabla^- E|(y) = \sup_{z \in Y\setminus\{y\}} \Big( \frac{E(y) - E(z)}{\mathsf{d}_Y(y,z)} + \frac{K}{2}\,\mathsf{d}_Y(y,z) \Big)^+, \]

so that $|\nabla^- E|(y)$ is the smallest constant $S \ge 0$ such that

\[ E(z) \ge E(y) - S\,\mathsf{d}_Y(z,y) + \frac{K}{2}\,\mathsf{d}_Y^2(z,y) \qquad\text{for every } z \in Y_{[y]}. \tag{2.23} \]

We recall (see [4, Corollary 2.4.10]) that for $K$-geodesically convex functionals the descending
slope is an upper gradient, as defined in Section 2.3: in particular

\[ E(y_t) \ge E(y_s) - \int_s^t |\nabla^- E|(y_r)\,|\dot y_r|\,\mathrm{d}r \qquad\text{for every } s,t \in [0,\infty),\ s < t \tag{2.24} \]

for all locally absolutely continuous curves $y : [0,\infty) \to D(E)$. A metric gradient flow for
$E$ is a locally absolutely continuous curve $y : [0,\infty) \to D(E)$ along which (2.24) holds as an
equality and moreover $|\dot y_t| = |\nabla^- E|(y_t)$ for a.e. $t \in (0,\infty)$.

An application of Young inequality shows that gradient flows for functionals can be char- 
acterized by the following definition. 

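Definition 2.11 ($E$-dissipation inequality and metric gradient flow) Let $E : Y \to \mathbb{R}\cup\{+\infty\}$ be a functional. We say that a locally absolutely continuous curve $[0,\infty) \ni t \mapsto y_t \in D(E)$ satisfies the $E$-dissipation inequality if

\[ E(y_0) \ge E(y_t) + \frac12\int_0^t |\dot y_r|^2\,\mathrm{d}r + \frac12\int_0^t |\nabla^- E|^2(y_r)\,\mathrm{d}r \qquad \forall t \ge 0. \tag{2.25} \]

$y$ is a gradient flow of $E$ starting from $y_0 \in D(E)$ if (2.25) holds as an equality, i.e.

\[ E(y_0) = E(y_t) + \frac12\int_0^t |\dot y_r|^2\,\mathrm{d}r + \frac12\int_0^t |\nabla^- E|^2(y_r)\,\mathrm{d}r \qquad \forall t \ge 0. \tag{2.26} \]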

By the remarks above, it is not hard to check that (2.26) is equivalent to the $E$-dissipation
inequality (2.25) whenever $t \mapsto E(y_t)$ is absolutely continuous, in particular if $|\nabla^- E|$ is an
upper gradient of $E$ (as for $K$-geodesically convex functionals). In this case (2.26) is equivalent
to

\[ \frac{\mathrm{d}}{\mathrm{d}t}E(y_t) = -|\dot y_t|^2 = -|\nabla^- E|^2(y_t) \qquad\text{for a.e. } t \in (0,\infty). \tag{2.27} \]

If $E : \mathbb{R}^d \to \mathbb{R}$ is a smooth functional, then a curve $(y_t)$ is a gradient flow according
to the previous definition if and only if it satisfies $y_t' = -\nabla E(y_t)$ for all $t \in (0,\infty)$, so that
the metric definition reduces to the classical one when specialized to Euclidean spaces and to
regular curves and functionals.
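The Euclidean case can be checked numerically. The sketch below takes the hypothetical example $E(y) = |y|^2/2$ on $\mathbb{R}^d$, whose classical gradient flow is $y_t = e^{-t}y_0$, and evaluates the defect in the energy identity (2.26) with the midpoint rule; the functional, the starting point and the step count are all assumptions of this illustration.

```python
import math


def energy(y):
    """E(y) = |y|^2 / 2 on R^d (a hypothetical smooth functional)."""
    return 0.5 * sum(c * c for c in y)


def flow(y0, t):
    """Exact solution of y' = -grad E(y) = -y, namely y_t = exp(-t) y0."""
    return [math.exp(-t) * c for c in y0]


def dissipation_defect(y0, t, steps=20000):
    """Right-hand side minus left-hand side of (2.26), by the midpoint rule.

    Along this flow |y'_r| = |grad E(y_r)| = |y_r|, so both integrands
    in (2.26) equal |y_r|^2 = 2 E(y_r).
    """
    h = t / steps
    integral = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        speed_sq = 2.0 * energy(flow(y0, r))   # |y'_r|^2
        slope_sq = speed_sq                    # |grad E|^2(y_r)
        integral += 0.5 * (speed_sq + slope_sq) * h
    return energy(flow(y0, t)) + integral - energy(y0)
```

For instance, `dissipation_defect([1.0, -2.0], 1.5)` should vanish up to quadrature error, in agreement with (2.26).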

3 Hopf-Lax semigroup in metric spaces 

In this section we study the properties of the functions given by Hopf-Lax formula in a metric
setting and the relations with the Hamilton-Jacobi equation. Here we only assume that
$(X,\mathsf{d})$ is an extended metric space until Theorem 3.5 (in particular, $(X,\mathsf{d})$ is not necessarily
$\mathsf{d}$-complete or $\mathsf{d}$-separable) and the measure structure $(X,\tau,\mathfrak{m})$ does not play a role, except
in Proposition 3.8 and Proposition 3.9. After Theorem 3.5 we will also assume that our space
is a length space.

Let $(X,\mathsf{d})$ be an extended metric space and $f : X \to \mathbb{R}\cup\{+\infty\}$. We define

\[ F(t,x,y) := f(y) + \frac{\mathsf{d}^2(x,y)}{2t}, \tag{3.1} \]

\[ Q_t f(x) := \inf_{y \in X} F(t,x,y), \qquad (x,t) \in X\times(0,\infty). \tag{3.2} \]

The map $(x,t) \mapsto Q_t f(x)$, $X\times(0,\infty) \to \overline{\mathbb{R}}$, is obviously $\mathsf{d}$-upper semicontinuous. The behavior
of $Q_t f$ is not trivial only in the set

\[ \mathcal{D}(f) := \{x \in X : \mathsf{d}(x,y) < \infty \text{ for some } y \text{ with } f(y) < \infty\} \tag{3.3} \]

and we shall restrict our analysis to $\mathcal{D}(f)$, so that $Q_t f(x) \in \mathbb{R}\cup\{-\infty\}$ for $(x,t) \in \mathcal{D}(f)\times(0,\infty)$. For $x \in \mathcal{D}(f)$ we set also

\[ t_*(x) := \sup\{t > 0 : Q_t f(x) > -\infty\} \]

with the convention $t_*(x) = 0$ if $Q_t f(x) = -\infty$ for all $t > 0$. Since $Q_t f(x) > -\infty$ implies
$Q_s f(y) > -\infty$ for all $s \in (0,t)$ and all $y$ at a finite distance from $x$, it follows that $t_*(x)$
depends only on the equivalence class $X_{[x]}$ of $x$, see (2.1).
Finally, we introduce the functions $D^+(x,t)$, $D^-(x,t)$ as

\[ D^+(x,t) := \sup \limsup_{n} \mathsf{d}(x,y_n), \qquad D^-(x,t) := \inf \liminf_{n} \mathsf{d}(x,y_n), \tag{3.4} \]

where, in both cases, the $(y_n)$'s vary among all minimizing sequences of $F(t,x,\cdot)$. It is easy to
check (arguing as in [4, Lemma 2.2.1, Lemma 3.1.2]) that $D^+(x,t)$ is finite for $0 < t < t_*(x)$
and that

\[ \lim_{i\to\infty} \mathsf{d}(x_i,x) = 0, \quad \lim_{i\to\infty} t_i = t \in (0,t_*(x)) \quad\Longrightarrow\quad \lim_{i\to\infty} Q_{t_i} f(x_i) = Q_t f(x), \tag{3.5} \]

\[ \sup\{D^+(y,t) : \mathsf{d}(x,y) \le R,\ 0 < t < t_*(x) - \varepsilon\} < \infty \qquad \forall R > 0,\ \varepsilon > 0. \tag{3.6} \]

Simple diagonal arguments show that the supremum and the infimum in (3.4) are attained.

Obviously $D^-(x,\cdot) \le D^+(x,\cdot)$; the next proposition shows that both functions are nondecreasing, and that they coincide out of a countable set.

Proposition 3.1 (Monotonicity of $D^\pm$) For all $x \in \mathcal{D}(f)$ it holds

\[ D^+(x,t) \le D^-(x,s) < \infty, \qquad 0 < t < s < t_*(x). \tag{3.7} \]

As a consequence, $D^+(x,\cdot)$ and $D^-(x,\cdot)$ are both nondecreasing in $(0,t_*(x))$ and they coincide
at all points therein with at most countably many exceptions.

Proof. Fix $x \in \mathcal{D}(f)$, $0 < t < s < t_*(x)$ and choose minimizing sequences $(x_t^n)$ and $(x_s^n)$ for
$F(t,x,\cdot)$ and $F(s,x,\cdot)$ respectively, such that $\lim_n \mathsf{d}(x,x_t^n) = D^+(x,t)$ and $\lim_n \mathsf{d}(x,x_s^n) = D^-(x,s)$. As a consequence, there exist the limits of $f(x_t^n)$ and $f(x_s^n)$ as $n \to \infty$. The
minimality of the sequences gives

\[ \lim_n f(x_t^n) + \frac{\mathsf{d}^2(x_t^n,x)}{2t} \le \lim_n f(x_s^n) + \frac{\mathsf{d}^2(x_s^n,x)}{2t}, \]

\[ \lim_n f(x_s^n) + \frac{\mathsf{d}^2(x_s^n,x)}{2s} \le \lim_n f(x_t^n) + \frac{\mathsf{d}^2(x_t^n,x)}{2s}. \]

Adding up and using the fact that $\frac1t \ge \frac1s$ we deduce

\[ D^+(x,t) = \lim_n \mathsf{d}(x_t^n,x) \le \lim_n \mathsf{d}(x_s^n,x) = D^-(x,s), \]

which is (3.7). Combining this with the inequality $D^- \le D^+$ we immediately obtain that
both functions are nondecreasing. At a point of right continuity of $D^-(x,\cdot)$ we get

\[ D^+(x,t) \le \inf_{s>t} D^-(x,s) = D^-(x,t). \]

This implies that the two functions coincide out of a countable set. □

Next, we examine the semicontinuity properties of $D^\pm$: they imply that points $(x,t)$ where
the equality $D^+(x,t) = D^-(x,t)$ occurs are continuity points for both $D^+$ and $D^-$.

Proposition 3.2 (Semicontinuity of $D^\pm$) Let $x_n \xrightarrow{\mathsf{d}} x$ and $t_n \to t \in (0,t_*(x))$. Then

\[ D^-(x,t) \le \liminf_{n\to\infty} D^-(x_n,t_n), \qquad D^+(x,t) \ge \limsup_{n\to\infty} D^+(x_n,t_n). \]

In particular, for every $x \in X$ the map $t \mapsto D^-(x,t)$ is left continuous in $(0,t_*(x))$ and the
map $t \mapsto D^+(x,t)$ is right continuous in $(0,t_*(x))$.

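Proof. For every $n \in \mathbb{N}$, let $(y_n^i)_{i\in\mathbb{N}}$ be a minimizing sequence for $F(t_n,x_n,\cdot)$ for which the
limit of $\mathsf{d}(y_n^i,x_n)$ as $i \to \infty$ equals $D^-(x_n,t_n)$. From (3.6) we see that we can assume that
$\sup_{i,n} \mathsf{d}(y_n^i,x_n)$ is finite. For all $n$ we have

\[ \lim_{i\to\infty} f(y_n^i) + \frac{\mathsf{d}^2(y_n^i,x_n)}{2t_n} = Q_{t_n} f(x_n). \]

Moreover, the $\mathsf{d}$-upper semicontinuity of $(x,t) \mapsto Q_t f(x)$ gives that $\limsup_n Q_{t_n} f(x_n) \le Q_t f(x)$. Since $\mathsf{d}(y_n^i,x_n)$ is bounded, $\sup_i |\mathsf{d}^2(y_n^i,x_n) - \mathsf{d}^2(y_n^i,x)|$ is infinitesimal as $n \to \infty$, hence
by a diagonal argument we can find a sequence $n \mapsto i(n)$ such that

\[ \limsup_{n\to\infty} f(y_n^{i(n)}) + \frac{\mathsf{d}^2(y_n^{i(n)},x)}{2t} \le Q_t f(x), \qquad |\mathsf{d}(x_n,y_n^{i(n)}) - D^-(x_n,t_n)| \le \frac1n. \]

This implies that $n \mapsto y_n^{i(n)}$ is a minimizing sequence for $F(t,x,\cdot)$, therefore

\[ D^-(x,t) \le \liminf_{n\to\infty} \mathsf{d}(x,y_n^{i(n)}) = \liminf_{n\to\infty} \mathsf{d}(x_n,y_n^{i(n)}) = \liminf_{n\to\infty} D^-(x_n,t_n). \]

If we choose, instead, sequences $(y_n^i)_{i\in\mathbb{N}}$ on which the supremum in the definition of $D^+(x_n,t_n)$
is attained, we obtain the upper semicontinuity property. □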

Before stating the next proposition we recall that semiconcave functions $g$ on an open interval
are local quadratic perturbations of concave functions; they inherit from concave functions
all pointwise differentiability properties, as existence of right and left derivatives $\frac{\mathrm{d}^-}{\mathrm{d}t}g \ge \frac{\mathrm{d}^+}{\mathrm{d}t}g$,
and similar.

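Proposition 3.3 (Time derivative of $Q_t f$) The map $(0,t_*(x)) \ni t \mapsto Q_t f(x)$ is locally
Lipschitz and locally semiconcave. For all $t \in (0,t_*(x))$ it satisfies

\[ \frac{\mathrm{d}^-}{\mathrm{d}t} Q_t f(x) = -\frac{(D^-(x,t))^2}{2t^2}, \qquad \frac{\mathrm{d}^+}{\mathrm{d}t} Q_t f(x) = -\frac{(D^+(x,t))^2}{2t^2}. \tag{3.8} \]

In particular, $s \mapsto Q_s f(x)$ is differentiable at $t \in (0,t_*(x))$ if and only if $D^+(x,t) = D^-(x,t)$.

Proof. Let $(x_t^n)$, $(x_s^n)$ be minimizing sequences for $F(t,x,\cdot)$ and $F(s,x,\cdot)$. We have

\[ Q_s f(x) - Q_t f(x) \le \liminf_{n\to\infty} F(s,x,x_t^n) - F(t,x,x_t^n) = \liminf_{n\to\infty} \frac{\mathsf{d}^2(x,x_t^n)}{2}\Big(\frac1s - \frac1t\Big), \tag{3.9} \]

\[ Q_s f(x) - Q_t f(x) \ge \limsup_{n\to\infty} F(s,x,x_s^n) - F(t,x,x_s^n) = \limsup_{n\to\infty} \frac{\mathsf{d}^2(x,x_s^n)}{2}\Big(\frac1s - \frac1t\Big). \tag{3.10} \]

If $s > t$ we obtain

\[ \frac{(D^-(x,s))^2}{2}\Big(\frac1s - \frac1t\Big) \le Q_s f(x) - Q_t f(x) \le \frac{(D^+(x,t))^2}{2}\Big(\frac1s - \frac1t\Big); \tag{3.11} \]

recalling that $\lim_{s\downarrow t} D^-(x,s) = D^+(x,t)$, a division by $s - t$ and a limit as $s \downarrow t$ gives the
identity for the right derivative in (3.8). A similar argument, dividing by $t - s < 0$ and passing
to the limit as $t \uparrow s$ yields the left derivative in (3.8).

The local Lipschitz continuity follows by (3.11) recalling that $D^\pm(x,\cdot)$ are locally bounded
functions; we easily get the quantitative bound

\[ \Big\| \frac{\mathrm{d}}{\mathrm{d}t} Q_t f(x) \Big\|_{L^\infty(\tau,\tau')} \le \frac{1}{2\tau^2}\,\|D^+(x,\cdot)\|^2_{L^\infty(\tau,\tau')} \qquad\text{for every } 0 < \tau < \tau' < t_*(x). \tag{3.12} \]

Since the distributional derivative of the function $t \mapsto [D^+(x,t)]^2/(2t^2)$ is locally bounded
from below, we also deduce that $t \mapsto Q_t f$ is locally semiconcave. □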



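Proposition 3.4 (Slopes and upper gradients of $Q_t f$) For $x \in \mathcal{D}(f)$ it holds:

\[ t \in (0,t_*(x)) \quad\Longrightarrow\quad |\nabla Q_t f|(x) \le \frac{D^+(x,t)}{t}, \tag{3.13a} \]

\[ Q_t f(x) > -\infty \quad\Longrightarrow\quad |\nabla^+ Q_t f|(x) \le \frac{D^-(x,t)}{t}. \tag{3.13b} \]

In addition, for all $t \in (0,t_*(x))$, $D^-(\cdot,t)/t$ is an upper gradient of $Q_t f$ restricted to $X_{[x]} = \{y : \mathsf{d}(x,y) < \infty\}$.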

Proof. Let us first prove that for arbitrary $x$, $y$ at finite distance with $Q_t f(y) > -\infty$ we
have the estimate

\[ Q_t f(x) - Q_t f(y) \le \mathsf{d}(x,y)\Big( \frac{D^-(y,t)}{t} + \frac{\mathsf{d}(x,y)}{2t} \Big). \tag{3.14} \]

It is sufficient to take a minimizing sequence $(y_n)$ for $F(t,y,\cdot)$ on which the infimum in the
definition of $D^-(y,t)$ is attained, obtaining

\[ Q_t f(x) - Q_t f(y) \le \liminf_{n\to\infty} F(t,x,y_n) - F(t,y,y_n) = \liminf_{n\to\infty} \frac{\mathsf{d}^2(x,y_n)}{2t} - \frac{\mathsf{d}^2(y,y_n)}{2t} \]

\[ \le \liminf_{n\to\infty} \frac{\mathsf{d}(x,y)}{2t}\big(\mathsf{d}(x,y_n) + \mathsf{d}(y,y_n)\big) \le \frac{\mathsf{d}(x,y)}{2t}\big(\mathsf{d}(x,y) + 2D^-(y,t)\big). \]

Dividing both sides of (3.14) by $\mathsf{d}(x,y)$ and taking the limsup as $y \to x$ we get (3.13a)
for the descending slope, since $D^- \le D^+$ and Proposition 3.2 yields the upper semicontinuity of $D^+$. The
implication (3.13b) follows by the same argument, by inverting the role of $x$ and $y$ in (3.14)
and still taking the limsup as $y \to x$ after a division by $\mathsf{d}(x,y)$. The complete inequality in
(3.13a) follows by (2.7).

We conclude with the proof of the upper gradient property. Let $t \in (0,t_*(x))$, let
$\gamma : [0,1] \to X_{[x]}$ be an absolutely continuous curve with constant speed (this is not restrictive, up to a reparameterization), and notice that (3.5) gives that $Q_t f(\gamma_s)$ is continuous
in $[0,1]$ whereas Proposition 3.2 shows the lower semicontinuity (and thus the measurability)
of $D^-(\gamma_s,t)$. By applying (3.14) with $x = \gamma_{s'}$, $y = \gamma_s$, if $s \mapsto D^-(\gamma_s,t) \in L^1(0,1)$ we can use
Corollary 2.8 to obtain that $s \mapsto Q_t f(\gamma_s)$ is absolutely continuous. Coming back to (3.14) we
obtain that $\big|\frac{\mathrm{d}}{\mathrm{d}s} Q_t f(\gamma_s)\big| \le |\dot\gamma_s|\,D^-(\gamma_s,t)/t$ for a.e. $s \in [0,1]$. □



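Theorem 3.5 (Subsolution of HJ) For $x \in \mathcal{D}(f)$ and $t \in (0,t_*(x))$ the right and left
derivatives $\frac{\mathrm{d}^\pm}{\mathrm{d}t} Q_t f(x)$ satisfy

\[ \frac{\mathrm{d}^+}{\mathrm{d}t} Q_t f(x) + \frac{|\nabla Q_t f|^2(x)}{2} \le 0, \qquad \frac{\mathrm{d}^-}{\mathrm{d}t} Q_t f(x) + \frac{|\nabla^+ Q_t f|^2(x)}{2} \le 0. \]

In particular

\[ \frac{\mathrm{d}}{\mathrm{d}t} Q_t f(x) + \frac{|\nabla Q_t f|^2(x)}{2} \le 0 \tag{3.15} \]

with at most countably many exceptions in $(0,t_*(x))$.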

Proof. The first claim is a direct consequence of Propositions 3.3 and 3.4. The second
one (3.15) follows by the fact that the larger derivative, namely the left one, coincides with
$-[D^-(x,t)]^2/(2t^2)$, and then with $-[D^+(x,t)]^2/(2t^2)$ with at most countably many exceptions. The latter is smaller than $-|\nabla Q_t f|^2(x)/2$ by (3.13a). □

We just proved that in an arbitrary extended metric space the Hopf-Lax formula produces
subsolutions of the Hamilton-Jacobi equation. Our aim now is to prove that, if $(X,\mathsf{d})$ is a
length space, then the same formula provides also supersolutions.

We say that $(X,\mathsf{d})$ is a length space if for all $x, y \in X$ the infimum of the length of
continuous curves joining $x$ to $y$ is equal to $\mathsf{d}(x,y)$. We remark that under this assumption
it can be proved that the Hopf-Lax formula produces a semigroup (see for instance the proof
in [26]), while in general only the inequality $Q_{s+t} f \le Q_s(Q_t f)$ holds.

Proposition 3.6 (Solution of HJ and agreement of slopes) Assume that $(X,\mathsf{d})$ is a length
space. Then for all $x \in \mathcal{D}(f)$ and $t \in (0,t_*(x))$ it holds

\[ |\nabla^- Q_t f|(x) = |\nabla Q_t f|(x) = \frac{D^+(x,t)}{t}, \tag{3.16} \]

so that equality holds in (3.13a). In particular, the right time derivative of $Q_t f$ satisfies

\[ \frac{\mathrm{d}^+}{\mathrm{d}t} Q_t f(x) + \frac{|\nabla Q_t f|^2(x)}{2} = 0 \qquad\text{for every } t \in (0,t_*(x)), \tag{3.17} \]

and equality holds in (3.15), with at most countably many exceptions.

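Proof. Let $(y_i)$ be a minimizing sequence for $F(t,x,\cdot)$ on which the supremum in the definition
of $D^+(x,t)$ is attained. Let $\gamma^i : [0,1] \to X$ be continuous curves connecting $x$ to $y_i$ whose
lengths $\mathcal{L}(\gamma^i)$ converge to $D^+(x,t)$. For every $s \in (0,1)$ we have

\[ \limsup_{i\to\infty} Q_t f(x) - Q_t f(\gamma^i_s) \ge \limsup_{i\to\infty} F(t,x,y_i) - F(t,\gamma^i_s,y_i) = \limsup_{i\to\infty} \frac{\mathsf{d}^2(x,y_i) - \mathsf{d}^2(\gamma^i_s,y_i)}{2t} \]

and our assumption on the $\gamma^i$'s ensures that

\[ \lim_{i\to\infty} \frac{\mathsf{d}(x,\gamma^i_s)}{s\,\mathsf{d}(x,y_i)} = 1, \qquad \lim_{i\to\infty} \frac{\mathsf{d}(\gamma^i_s,y_i)}{(1-s)\,\mathsf{d}(x,y_i)} = 1 \qquad\text{for every } s \in (0,1). \]

Therefore we obtain

\[ \limsup_{i\to\infty} \frac{Q_t f(x) - Q_t f(\gamma^i_s)}{\mathsf{d}(x,\gamma^i_s)} \ge \limsup_{i\to\infty} \frac{\big(\mathsf{d}(x,y_i) - \mathsf{d}(\gamma^i_s,y_i)\big)\big(\mathsf{d}(x,y_i) + \mathsf{d}(\gamma^i_s,y_i)\big)}{2t\,\mathsf{d}(x,\gamma^i_s)} = \frac{(2-s)\,D^+(x,t)}{2t} \]

for all $s \in (0,1)$. With a diagonal argument we find $s(i) \downarrow 0$ such that

\[ \limsup_{i\to\infty} \frac{Q_t f(x) - Q_t f(\gamma^i_{s(i)})}{\mathsf{d}(x,\gamma^i_{s(i)})} \ge \frac{D^+(x,t)}{t}. \]

Since $i \mapsto \gamma^i_{s(i)}$ is a particular sequence converging to $x$ we deduce

\[ |\nabla^- Q_t f|(x) \ge \frac{D^+(x,t)}{t}. \]

Thanks to (3.13a) and to the inequality $|\nabla^- Q_t f| \le |\nabla Q_t f|$, this proves that $|\nabla^- Q_t f|(x) = |\nabla Q_t f|(x) = D^+(x,t)/t$.

Recalling (3.13b) we have $|\nabla^+ Q_t f| \le |\nabla^- Q_t f|$ and therefore $|\nabla Q_t f| = |\nabla^- Q_t f|$ by (2.7).
Taking Proposition 3.3 into account we obtain (3.17) and that the Hamilton-Jacobi equation
is satisfied at all points $x$ such that $D^+(x,t) = D^-(x,t)$. □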
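A one-dimensional example makes the statement concrete. The sketch below takes $X = \mathbb{R}$ with the Euclidean distance (a length space) and the hypothetical datum $f(y) = |y|$, for which $Q_t f$ has the closed form coded in `huber`; it compares this with a brute-force minimization over a grid and evaluates the Hamilton-Jacobi residual by central differences. The grid, the step `eps` and the sample points are assumptions of this illustration.

```python
def hopf_lax(f, x, t, grid):
    """Q_t f(x) with the infimum in (3.2) restricted to the points of `grid`."""
    return min(f(y) + (x - y) ** 2 / (2.0 * t) for y in grid)


def huber(x, t):
    """Closed form of Q_t f for f(y) = |y| on the real line."""
    return x * x / (2.0 * t) if abs(x) <= t else abs(x) - t / 2.0


def hj_residual(x, t, eps=1e-5):
    """d/dt Q_t f(x) + |d/dx Q_t f(x)|^2 / 2, by central differences."""
    dt = (huber(x, t + eps) - huber(x, t - eps)) / (2.0 * eps)
    dx = (huber(x + eps, t) - huber(x - eps, t)) / (2.0 * eps)
    return dt + 0.5 * dx * dx


# Hypothetical grid on [-5, 5] with spacing 0.001.
grid = [k / 1000.0 for k in range(-5000, 5001)]
```

At a point such as $(x,t) = (2,1)$ the unique minimizer is $y = 1$, so $D^+(x,t) = D^-(x,t) = 1$ and the slope of $Q_t f$ equals $D^+(x,t)/t = 1$, consistently with (3.16); the residual vanishes there up to discretization error.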

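When $f$ is bounded the maps $Q_t f$ are easily seen to be bounded and $\mathsf{d}$-Lipschitz. It is
immediate to see that

\[ \inf_X f \le \inf_X Q_t f \le \sup_X Q_t f \le \sup_X f. \tag{3.18} \]

A quantitative global estimate we shall need later on is:

\[ \mathrm{Lip}(Q_t f) \le 2\sqrt{\frac{\mathrm{osc}(f)}{t}}, \qquad\text{where } \mathrm{osc}(f) := \sup_X f - \inf_X f. \tag{3.19} \]

It can be derived noticing that choosing a minimizing sequence $(y_n)_{n\in\mathbb{N}}$ for $F(t,x,\cdot)$ attaining
the supremum in (3.4), the energy comparison

\[ \frac{(D^+(x,t))^2}{2t} - \mathrm{osc}(f) \le \lim_{n\to\infty} f(y_n) + \frac{\mathsf{d}^2(x,y_n)}{2t} - f(x) = Q_t f(x) - f(x) \le 0 \]

yields

\[ D^+(x,t) \le \sqrt{2t\,\mathrm{osc}(f)}. \tag{3.20} \]

Since $D^-(x,t) \le D^+(x,t)$, setting $R := (\sqrt2 - 1)\sqrt{2t\,\mathrm{osc}(f)}$, (3.14) and simple calculations
yield

\[ \frac{Q_t f(x) - Q_t f(y)}{\mathsf{d}(x,y)} \le 2\Big(\frac{\mathrm{osc}(f)}{t}\Big)^{1/2} \qquad\text{if } 0 < \mathsf{d}(x,y) \le R, \]

and, since $\mathrm{osc}(Q_t f) \le \mathrm{osc}(f)$ by (3.18),

\[ \frac{Q_t f(x) - Q_t f(y)}{\mathsf{d}(x,y)} \le \frac{\mathrm{osc}(f)}{R} \le 2\Big(\frac{\mathrm{osc}(f)}{t}\Big)^{1/2} \qquad\text{if } \mathsf{d}(x,y) \ge R. \]

The constant 2 in (3.19) can be reduced to $\sqrt2$ if $X$ is a length space: it is sufficient to combine
(3.20) with (3.13a).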
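The bound (3.19) can be tested on a simple bounded datum. The sketch below uses a hypothetical step function on the real line with $\mathrm{osc}(f) = 1$, computes $Q_t f$ by brute force on a grid, and measures the Lipschitz constant of the sampled function; the step function, the grid and the value of $t$ are assumptions of this illustration.

```python
import math


def f(y):
    """A bounded step function on the real line, with osc(f) = 1."""
    return 0.0 if y < 0.0 else 1.0


def hopf_lax_on_grid(t, grid):
    """Q_t f on the grid, with the infimum in (3.2) restricted to the grid."""
    return [min(f(y) + (x - y) ** 2 / (2.0 * t) for y in grid) for x in grid]


def lipschitz_constant(values, grid):
    """Largest difference quotient between consecutive grid points.

    On a one-dimensional grid this equals the Lipschitz constant of the
    sampled function, by the triangle inequality.
    """
    return max(abs(values[k + 1] - values[k]) / (grid[k + 1] - grid[k])
               for k in range(len(grid) - 1))


grid = [k / 100.0 for k in range(-300, 301)]   # hypothetical grid on [-3, 3]
t, osc = 1.0, 1.0
lip = lipschitz_constant(hopf_lax_on_grid(t, grid), grid)
```

Since the real line is a length space, the computed value should stay below $\sqrt{2\,\mathrm{osc}(f)/t}$, and for this datum it comes close to that threshold, which suggests that the constant $\sqrt2$ cannot be improved in general.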

We conclude this section with a simple observation, a technical lemma, where also a Polish 
structure is involved, and with some relations between slope of Kantorovich potentials and 
Wasserstein distance. 

Remark 3.7 (Continuity of $Q_t$ at $t = 0$) If $(X,\tau,\mathsf{d})$ is an extended Polish space and $\varphi$ is
bounded and $\tau$-lower semicontinuous, then $Q_t\varphi \uparrow \varphi$ as $t \downarrow 0$. This is a simple consequence of
assumption (iii) in Definition 2.3. ■

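Proposition 3.8 Let $(X,\tau,\mathsf{d})$ be an extended Polish space.

(i) If $K \subset X$ is compact, $\psi \in C(K)$, $M \ge \max_K \psi$ and

\[ \varphi(x) = \begin{cases} \psi(x) & \text{if } x \in K, \\ M & \text{if } x \in X\setminus K, \end{cases} \]

then $Q_t\varphi$ is $\tau$-lower semicontinuous in $X$ for all $t > 0$;

(ii) if $\mathcal{D}(\varphi) = X$, $t_*(x) \ge T > 0$ for all $x \in X$ and $Q_t\varphi$ is Borel measurable for all $t > 0$
then $\frac{\mathrm{d}^\pm}{\mathrm{d}t} Q_t\varphi(x)$ is Borel measurable in $X\times(0,T)$ and the slopes

\[ (x,t) \mapsto |\nabla^+ Q_t\varphi|(x), \qquad (x,t) \mapsto |\nabla^- Q_t\varphi|(x) \]

are $\mathscr{B}^*(X\times(0,T))$-measurable in $X\times(0,T)$.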

Proof. (i) The proof is straightforward, using the identity

\[ Q_t\varphi(x) = \min\Big\{ M,\ \min_{y\in K} \psi(y) + \frac{1}{2t}\mathsf{d}^2(x,y) \Big\}. \]

(ii) The Borel measurability of $\frac{\mathrm{d}^\pm}{\mathrm{d}t} Q_t\varphi(x)$ is a simple consequence of the continuity of
$t \mapsto Q_t\varphi(x)$, together with the Borel measurability of $Q_t\varphi$. A simple time discretization
argument also shows that $(x,t) \mapsto Q_t\varphi(x)$ is Borel measurable. Then, the proof of the
measurability of slopes follows as in Lemma 2.4. □

In the next proposition we consider the ascending slope of Kantorovich potentials, for 
finite distances d. 

Proposition 3.9 (Slope and approximation of Kantorovich potentials) Let $\mu, \nu \in \mathscr{P}(X)$
with $W_2(\mu,\nu) < \infty$ and let $\gamma \in \mathscr{P}(X\times X)$ be an optimal plan with marginals $\mu$, $\nu$. If $\varphi$ is a
Kantorovich potential relative to $\gamma$, we have

\[ |\nabla^+\varphi|(x) \le \mathsf{d}(x,y) \qquad\text{for } \gamma\text{-a.e. } (x,y). \tag{3.22} \]

In particular $|\nabla^+\varphi| \in L^2(X,\mu)$ and $\int_X |\nabla^+\varphi|^2\,\mathrm{d}\mu \le W_2^2(\mu,\nu)$.

Proof. We set $f := -\varphi^c$, so that from $\varphi = (\varphi^c)^c$ we have $\varphi = Q_1 f$. In addition, the definition
of Kantorovich potential tells us that $\varphi(x) = f(y) + \mathsf{d}^2(x,y)/2$ for $\gamma$-a.e. $(x,y)$, so that

\[ D^-(x,1) \le \mathsf{d}(x,y) \qquad\text{for } \gamma\text{-a.e. } (x,y). \tag{3.23} \]

Taking (3.13b) into account we obtain (3.22). □

In general the inequality $\int_X |\nabla^+\varphi|^2\,\mathrm{d}\mu \le W_2^2(\mu,\nu)$ can be strict, as the following simple
example shows:

Example 3.10 Let $X = [0,1]$ endowed with the Euclidean distance, $\mu_0 = \delta_0$ and $\mu_t = t^{-1}\chi_{[0,t]}\mathscr{L}^1$ for $t \in (0,1]$. Then clearly $(\mu_t)$ is a constant speed geodesic connecting $\mu_0$ to $\mu_1$
and the corresponding Kantorovich potential is $\varphi(x) = x^2/2 - x$, so that $\int_X |\nabla^+\varphi|^2\,\mathrm{d}\mu_0 = 0$,
while $W_2^2(\mu_0,\mu_1) = 1/3$.
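The two numbers in this example can be reproduced with a few lines of Python. The sketch below is only a numerical illustration: the quadrature step, the sampling radius and the helper names are assumptions introduced here.

```python
def phi(x):
    """Kantorovich potential of Example 3.10 on X = [0, 1]."""
    return 0.5 * x * x - x


def w2_squared(steps=10000):
    """W_2^2(delta_0, Lebesgue on [0, 1]) as the integral of y^2 over [0, 1].

    Since one marginal is a Dirac mass, the only admissible plan is the
    product measure, so the cost is computed directly (midpoint rule).
    """
    h = 1.0 / steps
    return sum(((k + 0.5) * h) ** 2 for k in range(steps)) * h


def ascending_slope_at_zero(radius=1e-3, samples=1000):
    """Sup of (phi(y) - phi(0))^+ / |y| over sampled y in (0, radius]."""
    ys = [radius * (k + 1) / samples for k in range(samples)]
    return max(max(phi(y) - phi(0.0), 0.0) / y for y in ys)
```

Here $\varphi(y) - \varphi(0) = y^2/2 - y < 0$ on $(0,1]$, so the ascending slope at the only point charged by $\mu_0$ vanishes, while the transport cost is $1/3$.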



4 Relaxed gradient, Cheeger's energy, and its $L^2$-gradient flow

In this section we assume that $(X,\tau,\mathsf{d})$ is a Polish extended space. Furthermore, $\mathfrak{m}$ is a
nonnegative, Borel and $\sigma$-finite measure on $X$. Recall that

\[ \text{there exists a bounded Borel function } \vartheta : X \to (0,\infty) \text{ such that } \int_X \vartheta\,\mathrm{d}\mathfrak{m} \le 1. \tag{4.1} \]

Notice that $\mathfrak{m}$ and the finite measure $\tilde{\mathfrak{m}} := \vartheta\,\mathfrak{m}$ share the same class of negligible sets. In the
following we will often assume that $\mathfrak{m}$ and $\vartheta$ satisfy some further structural conditions, which
will be described as they occur. For future references, let us just state here our strongest
assumption in advance: we will often assume that $\vartheta$ has the form $e^{-V^2}$, where

\[ V : X \to [0,\infty) \text{ is a Borel } \mathsf{d}\text{-Lipschitz map, it is bounded on each compact set } K \subset X, \text{ and } \int_X e^{-V^2}\,\mathrm{d}\mathfrak{m} \le 1. \tag{4.2} \]

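When $\tau$ is the topology induced by the finite distance $\mathsf{d}$, then the facts that $V$ is Borel and
bounded on compact sets are obvious consequences of the $\mathsf{d}$-Lipschitz property. In this case a
simple choice is $V(x) = \sqrt{\kappa/2}\,\mathsf{d}(x,x_0)$ for some $x_0 \in X$ and $\kappa > 0$. It is not difficult to check
that (4.2) is then equivalent to

\[ \exists\,\kappa > 0 : \quad m(r) \le e^{\frac{\kappa}{2}r^2}, \qquad\text{where } m(r) := \mathfrak{m}(\{x \in X : \mathsf{d}(x,x_0) < r\}). \tag{4.3} \]

In fact, for every $h > 0$

\[ \int_X e^{-\frac{h}{2}\mathsf{d}^2(x,x_0)}\,\mathrm{d}\mathfrak{m} = \int_X \int_{r > \mathsf{d}(x,x_0)} h\,r\,e^{-\frac{h}{2}r^2}\,\mathrm{d}r\,\mathrm{d}\mathfrak{m}(x) = \int_0^\infty h\,r\,m(r)\,e^{-\frac{h}{2}r^2}\,\mathrm{d}r. \tag{4.4} \]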
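The identity (4.4) can be checked numerically in a model case. The sketch below takes $X = \mathbb{R}$ with the Lebesgue measure and $x_0 = 0$, so that $m(r) = 2r$ and the left-hand side of (4.4) is the Gaussian integral $\sqrt{2\pi/h}$; this choice of space, the truncation radius and the step count are assumptions of this illustration.

```python
import math


def ball_measure(r):
    """m(r) for the Lebesgue measure on the real line with x_0 = 0."""
    return 2.0 * r


def lhs(h):
    """Integral over the real line of exp(-h x^2 / 2), in closed form."""
    return math.sqrt(2.0 * math.pi / h)


def rhs(h, r_max=20.0, steps=20000):
    """Integral over (0, r_max) of h r m(r) exp(-h r^2 / 2), midpoint rule."""
    dr = r_max / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr
        total += h * r * ball_measure(r) * math.exp(-0.5 * h * r * r) * dr
    return total
```

For $h = 7$ the integral is below 1, and one can verify on sample radii that $m(r)\,e^{-\frac{h}{2}r^2} \le 1$, which is the pointwise bound behind (4.3).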

Since $r \mapsto m(r)$ is nondecreasing, if the last integral in (4.4) is less than 1 for $h := \kappa$, then
Chebyshev's inequality yields $m(r)\,e^{-\frac12\kappa r^2} \le 1$; on the other hand, if (4.3) holds, then there
exists $h > \kappa$ sufficiently big such that the integral in (4.4) is less than 1, so that (4.2) holds.

4.1 Minimal relaxed gradient

The content of this subsection is inspired by Cheeger's work [10]. We are going to relax the
integral of the squared local Lipschitz constant of Lipschitz functions with respect to the
$L^2(X,\mathfrak{m})$ topology. By Lemma 2.4, $|\nabla f|$ is $\mathscr{B}^*(X)$-measurable whenever $f$ is $\mathsf{d}$-Lipschitz and
Borel.

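Proposition 4.1 Let $(X,\tau,\mathsf{d})$ be an extended Polish space and let $\mathfrak{m}$ be a nonnegative, Borel
measure in $(X,\tau)$ satisfying the following condition (weaker than (4.2)):

\[ \forall K \subset X \text{ compact} \quad \exists\,r > 0 : \quad \mathfrak{m}(\{x \in X : \mathsf{d}(x,K) \le r\}) < \infty. \tag{4.5} \]

Then the class of bounded, Borel and $\mathsf{d}$-Lipschitz functions $f \in L^2(X,\mathfrak{m})$ with $|\nabla f| \in L^2(X,\mathfrak{m})$
is dense in $L^2(X,\mathfrak{m})$.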

Proof. It suffices to approximate functions $\varphi : X \to \mathbb{R}$ such that for some compact set $K \subset X$

\[ \varphi|_K \in C^0(K), \qquad \varphi \equiv 0 \text{ in } X\setminus K. \]

By taking the positive and negative part, we can always assume that $\varphi$ is, e.g., nonnegative.
We can thus define

\[ \varphi_n(x) := \sup_{y\in K} \big[\varphi(y) - n\,\mathsf{d}(x,y)\big]^+. \]

It is not difficult to check that $\varphi_n$ is upper semicontinuous, nonnegative, $n$-Lipschitz and
bounded above by $S := \max_K \varphi \ge 0$; moreover

\[ \varphi_n(x) = |\nabla\varphi_n|(x) = 0 \qquad\text{if } \mathsf{d}(x,K) > S/n. \]

If $r > 0$ is given by (4.5), choosing $n > S/r$ we get that $|\nabla\varphi_n|$ are supported in the set
$\{x \in X : \mathsf{d}(x,K) \le r\}$ of finite measure, so that they belong to $L^2(X,\mathfrak{m})$; since $S \ge \varphi_n(x) \ge \varphi(x)$ and $\varphi_n(x) \downarrow \varphi(x)$ for every $x \in X$, we conclude. □
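The approximating functions $\varphi_n$ used in this proof are easy to inspect on the real line. The sketch below takes the hypothetical data $K = [0,1]$ and $\varphi = 1 + x$ on $K$, replaces the supremum over $K$ by a maximum over a fine grid, and samples $\varphi_n$ on $[-2,3]$; all of these choices are assumptions of this illustration.

```python
K = [k / 200.0 for k in range(201)]   # grid standing in for the compact set K = [0, 1]
S = 2.0                               # S = max of phi over K


def phi(x):
    """phi = 1 + x on K = [0, 1] and 0 outside (continuous on K, nonnegative)."""
    return 1.0 + x if 0.0 <= x <= 1.0 else 0.0


def phi_n(x, n):
    """phi_n(x) = sup over y in K of [phi(y) - n |x - y|]^+, sup taken on the grid."""
    return max(max(phi(y) - n * abs(x - y), 0.0) for y in K)


xs = [k / 100.0 for k in range(-200, 301)]   # sample points in [-2, 3]
```

On the samples one observes the properties listed in the proof: $\varphi \le \varphi_n \le S$, the $n$-Lipschitz bound, $\varphi_n = 0$ at distance larger than $S/n$ from $K$, and monotonicity in $n$.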

Definition 4.2 (Relaxed gradients) We say that $G \in L^2(X,\mathfrak{m})$ is a relaxed gradient of
$f \in L^2(X,\mathfrak{m})$ if there exist Borel $\mathsf{d}$-Lipschitz functions $f_n \in L^2(X,\mathfrak{m})$ such that:

(a) $f_n \to f$ in $L^2(X,\mathfrak{m})$ and $|\nabla f_n|$ weakly converge to $\tilde G$ in $L^2(X,\mathfrak{m})$;

(b) $\tilde G \le G$ $\mathfrak{m}$-a.e. in $X$.

We say that $G$ is the minimal relaxed gradient of $f$ if its $L^2(X,\mathfrak{m})$ norm is minimal among
relaxed gradients. We shall denote by $|\nabla f|_*$ the minimal relaxed gradient.

The definition of minimal relaxed gradient is well posed; indeed, thanks to (2.8a) and to
the reflexivity of $L^2(X,\mathfrak{m})$, the collection of relaxed gradients of $f$ is a convex set, possibly
empty. Its closure follows by the following lemma, which also shows that it is possible to
obtain the minimal relaxed gradient as strong limit in $L^2$.

Lemma 4.3 (Closure and strong approximation of the minimal relaxed gradient)

(a) If $G \in L^2(X,\mathfrak{m})$ is a relaxed gradient of $f \in L^2(X,\mathfrak{m})$ then there exist Borel $\mathsf{d}$-Lipschitz
functions $f_n$ converging to $f$ in $L^2(X,\mathfrak{m})$ and $G_n \in L^2(X,\mathfrak{m})$ strongly convergent to $\tilde G$
in $L^2(X,\mathfrak{m})$ with $|\nabla f_n| \le G_n$ and $\tilde G \le G$.

(b) If $G_n \in L^2(X,\mathfrak{m})$ is a relaxed gradient of $f_n \in L^2(X,\mathfrak{m})$ and $f_n \rightharpoonup f$, $G_n \rightharpoonup G$ weakly
in $L^2(X,\mathfrak{m})$, then $G$ is a relaxed gradient of $f$.

(c) In particular, the collection of all the relaxed gradients of $f$ is closed in $L^2(X,\mathfrak{m})$ and
there exist Borel $\mathsf{d}$-Lipschitz functions $f_n \in L^2(X,\mathfrak{m})$ such that

\[ f_n \to f, \qquad |\nabla f_n| \to |\nabla f|_* \qquad\text{strongly in } L^2(X,\mathfrak{m}). \tag{4.6} \]

Proof. (a) Since $G$ is a relaxed gradient, we can find Borel $\mathsf{d}$-Lipschitz functions $g_i \in L^2(X,\mathfrak{m})$
such that $g_i \to f$ in $L^2(X,\mathfrak{m})$ and $|\nabla g_i|$ weakly converges to $\tilde G \le G$ in $L^2(X,\mathfrak{m})$; by Mazur's
lemma we can find a sequence of convex combinations $G_n$ of $|\nabla g_i|$, starting from an index
$i(n) \to \infty$, strongly convergent to $\tilde G$ in $L^2(X,\mathfrak{m})$; the corresponding convex combinations of
$g_i$, that we shall denote by $f_n$, still converge in $L^2(X,\mathfrak{m})$ to $f$ and $|\nabla f_n|$ is bounded from
above by $G_n$.

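(b) Let us prove now the weak closure in $L^2(X,\mathfrak{m})\times L^2(X,\mathfrak{m})$ of the set

\[ S := \{(f,G) \in L^2(X,\mathfrak{m})\times L^2(X,\mathfrak{m}) : G \text{ is a relaxed gradient for } f\}. \]

Since $S$ is convex, it is sufficient to prove that $S$ is strongly closed. If $S \ni (f^i,G^i) \to (f,G)$
strongly in $L^2(X,\mathfrak{m})\times L^2(X,\mathfrak{m})$, we can find sequences of Borel $\mathsf{d}$-Lipschitz functions $(f^i_n)_n \in L^2(X,\mathfrak{m})$ and of nonnegative functions $(G^i_n)_n \in L^2(X,\mathfrak{m})$ such that

\[ f^i_n \xrightarrow{n\to\infty} f^i, \quad G^i_n \xrightarrow{n\to\infty} \tilde G^i \quad\text{strongly in } L^2(X,\mathfrak{m}), \qquad |\nabla f^i_n| \le G^i_n, \quad \tilde G^i \le G^i. \]

Possibly extracting a suitable subsequence, we can assume that $\tilde G^i \rightharpoonup \tilde G$ weakly in $L^2(X,\mathfrak{m})$
with $\tilde G \le G$; by a standard diagonal argument and the reflexivity of $L^2(X,\mathfrak{m})$ we can also
find an increasing sequence $i \mapsto n(i)$ such that $f^i_{n(i)} \to f$, $|\nabla f^i_{n(i)}| \rightharpoonup H$, and $G^i_{n(i)} \rightharpoonup \tilde G$ in
$L^2(X,\mathfrak{m})$. It follows that $H \le \tilde G \le G$ so that $G$ is a relaxed gradient for $f$.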

(c) Let us consider now the minimal relaxed gradient $G := |\nabla f|_*$ and let $f_n$, $G_n$ be
sequences in $L^2(X,\mathfrak{m})$ as in the first part of the present Lemma. Since $|\nabla f_n|$ is uniformly
bounded in $L^2(X,\mathfrak{m})$ it is not restrictive to assume that it is weakly convergent to some limit
$H \in L^2(X,\mathfrak{m})$ with $0 \le H \le \tilde G \le G$. This implies at once that $H = \tilde G = G$ and $|\nabla f_n|$
weakly converges to $|\nabla f|_*$ (because any limit point in the weak topology of $|\nabla f_n|$ is a relaxed
gradient with minimal norm) and that the convergence is strong, since

\[ \limsup_{n\to\infty} \int_X |\nabla f_n|^2\,\mathrm{d}\mathfrak{m} \le \limsup_{n\to\infty} \int_X G_n^2\,\mathrm{d}\mathfrak{m} = \int_X \tilde G^2\,\mathrm{d}\mathfrak{m} = \int_X H^2\,\mathrm{d}\mathfrak{m}. \]

□

The distinguished role of the minimal relaxed gradient is also illustrated by the following 
lemma. 

Lemma 4.4 (Locality) Let $G_1$, $G_2$ be relaxed gradients of $f$. Then $\min\{G_1,G_2\}$ and $\chi_B G_1 + \chi_{X\setminus B} G_2$, $B \in \mathscr{B}(X)$, are relaxed gradients of $f$ as well. In particular, for any relaxed gradient
$G$ of $f$ it holds

\[ |\nabla f|_* \le G \qquad \mathfrak{m}\text{-a.e. in } X. \tag{4.7} \]

Proof. It is sufficient to prove that if $B \in \mathscr{B}(X)$, then $\chi_{X\setminus B} G_1 + \chi_B G_2$ is a relaxed gradient
of $f$. By approximation, taking into account the closure of the class of relaxed gradients, we
can assume with no loss of generality that $X\setminus B$ is a compact set, so that the $\mathsf{d}$-Lipschitz
function

\[ \rho(y) := \inf\{\mathsf{d}(y,x) : x \in X\setminus B\} \]

is $\tau$-lower semicontinuous and therefore $\mathscr{B}(X)$-measurable. Notice that, because of condition
(iii) in Definition 2.3, $\rho$ is strictly positive in $B$ and null on $X\setminus B$. Therefore it will be
sufficient to show that, setting $\chi_r := \min\{1,\rho/r\}$, $\chi_r G_1 + (1-\chi_r) G_2$ is a relaxed gradient for
all $r > 0$.

Let now $f_{n,i}$, $i = 1, 2$, be Borel, $\mathsf{d}$-Lipschitz and $L^2(X,\mathfrak{m})$ functions converging to $f$ in $L^2(X,\mathfrak{m})$
as $n \to \infty$ with $|\nabla f_{n,i}|$ weakly convergent to $\tilde G_i \le G_i$, and set $f_n := \chi_r f_{n,1} + (1-\chi_r) f_{n,2}$.
Then (2.9) immediately gives that $\chi_r G_1 + (1-\chi_r) G_2 \ge \chi_r \tilde G_1 + (1-\chi_r) \tilde G_2$ is a relaxed
gradient.

For the second part of the statement we argue by contradiction: let $G$ be a relaxed gradient
of $f$ and assume that there exists a Borel set $B$ with $\mathfrak{m}(B) > 0$ on which $G < |\nabla f|_*$. Consider
the relaxed gradient $G\chi_B + |\nabla f|_*\chi_{X\setminus B}$: its $L^2$ norm is strictly less than the norm of $|\nabla f|_*$,
which is a contradiction. □

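By (4.7), for $f$ Borel and $\mathsf{d}$-Lipschitz we get

\[ |\nabla f|_* \le |\nabla f| \qquad \mathfrak{m}\text{-a.e. in } X. \tag{4.8} \]

A direct byproduct of this characterization of $|\nabla f|_*$ is its invariance under multiplicative
perturbations of $\mathfrak{m}$ of the form $\theta\,\mathfrak{m}$, with

\[ 0 < c \le \theta \le C < \infty \qquad \mathfrak{m}\text{-a.e. on } X. \tag{4.9} \]

Indeed, the class of relaxed gradients is invariant under these perturbations.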

Theorem 4.5 Cheeger's functional 

Ch*(/):=^ / |V/|2dm, (4.10) 

set equal to +oo if f has no relaxed slope, is convex and lower semicontinuous in L^(X, m). 
// (4.5) holds, then its domain is dense in L^(X, m). 

Proof. A simple byproduct of condition (2.8a) is that $\alpha F + \beta G$ is a relaxed gradient of $\alpha f + \beta g$ whenever $\alpha, \beta$ are nonnegative constants and $F, G$ are relaxed gradients of $f, g$ respectively. Taking $F = |\nabla f|_*$ and $G = |\nabla g|_*$ yields

\[ |\nabla(\alpha f + \beta g)|_* \le \alpha|\nabla f|_* + \beta|\nabla g|_* \quad \text{for every } f, g \in D(\mathsf{Ch}_*),\ \alpha, \beta \ge 0. \tag{4.11} \]

This proves the convexity of $\mathsf{Ch}_*$, while lower semicontinuity follows by (b) of Lemma 4.3.

□
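
The convexity claim can be spelled out in one chain of inequalities. The following sketch assumes $f, g \in D(\mathsf{Ch}_*)$ and $t \in [0,1]$ (the symbol $t$ is introduced here only for illustration), and combines (4.11) with the convexity of $r \mapsto r^2$:

```latex
\begin{aligned}
\mathsf{Ch}_*\big((1-t)f+tg\big)
 &= \tfrac12\int_X \big|\nabla\big((1-t)f+tg\big)\big|_*^2\,\mathrm d\mathfrak m \\
 &\le \tfrac12\int_X \big((1-t)|\nabla f|_*+t|\nabla g|_*\big)^2\,\mathrm d\mathfrak m
   && \text{by (4.11)} \\
 &\le \tfrac12\int_X \big((1-t)|\nabla f|_*^2+t|\nabla g|_*^2\big)\,\mathrm d\mathfrak m
   && \text{convexity of } r\mapsto r^2 \\
 &= (1-t)\,\mathsf{Ch}_*(f)+t\,\mathsf{Ch}_*(g).
\end{aligned}
```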

Remark 4.6 (Cheeger's original functional) Our definition of $\mathsf{Ch}_*$ can be compared with the original one in [10]: the relaxation procedure is similar, but the approximating functions $f_n$ are not required to be Lipschitz and $|\nabla f_n|$ are replaced by upper gradients $G_n$ of $f_n$. Obviously, this leads to a smaller functional, that we shall denote by $\mathsf{Ch}_C$; this functional can still be represented by the integration of a local object, smaller $\mathfrak m$-a.e. than $|\nabla f|_*$, that we shall denote by $|\nabla f|_C$. Relating $\mathsf{Ch}_*$ to $\mathsf{Ch}_C$ amounts to finding, for any $f \in L^2(X, \mathfrak m)$ and any upper gradient $G$ of $f$, a sequence of Lipschitz functions $f_n$ such that $f_n \to f$ in $L^2(X, \mathfrak m)$ and

\[ \limsup_{n\to\infty} \int_X |\nabla f_n|^2 \,\mathrm d\mathfrak m \le \int_X G^2 \,\mathrm d\mathfrak m. \tag{4.12} \]

It is well known, see [10], that this approximation is possible (even in strong $W^{1,2}$ norm) if Poincaré and doubling hold.

A byproduct of our identification result, see Remark 5.10 in the next section, is the fact that $\mathsf{Ch}_C = \mathsf{Ch}_*$, i.e. that the approximation (4.12) with Lipschitz functions and their corresponding slopes instead of upper gradients is possible, without any regularity assumption on $(X, \mathsf d, \mathfrak m)$, besides (4.2). Also, in the case when $\mathsf d$ is a distance, taking into account the locality properties of the weak gradients, the result can be extended to locally finite measures.
■

Remark 4.7 (The Sobolev space $W^{1,2}(X, \mathsf d, \mathfrak m)$) As a simple consequence of the lower semicontinuity of Cheeger's functional, it can be proved that the domain of $\mathsf{Ch}_*$ endowed with the norm

\[ \|f\|_{W^{1,2}} := \sqrt{\|f\|_2^2 + \big\||\nabla f|_*\big\|_2^2} \]

is a Banach space (for a detailed proof see [10, Theorem 2.7]). We call this space $W^{1,2}(X, \mathsf d, \mathfrak m)$.

This notation may be misleading because, in general, $W^{1,2}(X, \mathsf d, \mathfrak m)$ is not a Hilbert space. This is the case, for example, of the metric measure space $(\mathbb R^d, \|\cdot\|, \mathscr L^d)$ where $\|\cdot\|$ is any norm not coming from an inner product. The fact that $W^{1,2}(X, \mathsf d, \mathfrak m)$ may fail to be Hilbert is strictly related to the potential lack of linearity of the heat flow, see also Remark 4.14 below (for computations in smooth spaces with nonlinear heat flows see [29]). Also, the reflexivity of $W^{1,2}$ and the density of Lipschitz functions in $W^{1,2}$ norm seem to be difficult problems at this level of generality, while it is known that both these facts are true in doubling spaces satisfying a local Poincaré inequality, see [10]. ■

Proposition 4.8 (Chain rule) If $f \in L^2(X, \mathfrak m)$ has a relaxed gradient, the following properties hold:

(a) for any $\mathscr L^1$-negligible Borel set $N \subset \mathbb R$ it holds $|\nabla f|_* = 0$ $\mathfrak m$-a.e. on $f^{-1}(N)$;

(b) $|\nabla f|_* = |\nabla g|_*$ $\mathfrak m$-a.e. on $\{f - g = c\}$ for all constants $c \in \mathbb R$ and $g \in L^2(X, \mathfrak m)$ with $\mathsf{Ch}_*(g) < \infty$;

(c) $\phi(f) \in D(\mathsf{Ch}_*)$ and $|\nabla\phi(f)|_* \le |\phi'(f)||\nabla f|_*$ for any Lipschitz function $\phi$ on an interval $J$ containing the image of $f$ (with $0 \in J$ and $\phi(0) = 0$ if $\mathfrak m$ is not finite);

(d) $\phi(f) \in D(\mathsf{Ch}_*)$ and $|\nabla\phi(f)|_* = \phi'(f)|\nabla f|_*$ for any nondecreasing and Lipschitz function $\phi$ on an interval $J$ containing the image of $f$ (with $0 \in J$ and $\phi(0) = 0$ if $\mathfrak m$ is not finite).

Proof. (a) We claim that for $\phi : \mathbb R \to \mathbb R$ continuously differentiable and Lipschitz on the image of $f$ it holds

\[ |\nabla\phi(f)|_* \le |\phi' \circ f||\nabla f|_* \quad \mathfrak m\text{-a.e. in } X, \tag{4.13} \]

for any $f \in D(\mathsf{Ch}_*)$. To prove this, observe that the pointwise inequality $|\nabla\phi(f)| \le |\phi' \circ f||\nabla f|$ trivially holds for $f \in L^2(X, \mathfrak m)$ Borel and $\mathsf d$-Lipschitz. The claim follows by an easy approximation argument, thanks to (4.6) of Lemma 4.3; when $\mathfrak m$ is not finite, we also require $\phi(0) = 0$ in order to be sure that $\phi \circ f \in L^2(X, \mathfrak m)$.

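Now, assume that $N$ is compact. In this case, let $A_n \subset \mathbb R$ be open sets such that $A_n \downarrow N$. Also, let $\psi_n : \mathbb R \to [0,1]$ be a continuous function satisfying $\chi_N \le \psi_n \le \chi_{A_n}$, and define $\phi_n : \mathbb R \to \mathbb R$ by

\[ \phi_n(0) = 0, \qquad \phi_n'(z) = 1 - \psi_n(z). \]

The sequence $(\phi_n)$ uniformly converges to the identity map, and each $\phi_n$ is 1-Lipschitz and $C^1$. Therefore $\phi_n \circ f$ converge to $f$ in $L^2$. Taking into account that $\phi_n' = 0$ on $N$ and (4.13) we deduce

\[
\begin{aligned}
\int_X |\nabla f|_*^2 \,\mathrm d\mathfrak m
 &\le \liminf_{n\to\infty} \int_X |\nabla\phi_n(f)|_*^2 \,\mathrm d\mathfrak m
  \le \liminf_{n\to\infty} \int_X |\phi_n' \circ f|^2 |\nabla f|_*^2 \,\mathrm d\mathfrak m \\
 &= \liminf_{n\to\infty} \int_{X\setminus f^{-1}(N)} |\phi_n' \circ f|^2 |\nabla f|_*^2 \,\mathrm d\mathfrak m
  \le \int_{X\setminus f^{-1}(N)} |\nabla f|_*^2 \,\mathrm d\mathfrak m.
\end{aligned}
\]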

It remains to deal with the case when $N$ is not compact. In this case we consider the finite Borel measure $\mu := f_\sharp \tilde{\mathfrak m}$ on $\mathbb R$, where $\tilde{\mathfrak m} = \vartheta\mathfrak m$ is the finite measure defined as in (4.1). Then there exists an increasing sequence $(K_n)$ of compact subsets of $N$ such that $\mu(K_n) \uparrow \mu(N)$. By the result for the compact case we know that $|\nabla f|_* = 0$ $\mathfrak m$-a.e. on $\cup_n f^{-1}(K_n)$, and by definition of push forward and the fact that $\tilde{\mathfrak m}$ and $\mathfrak m$ have the same negligible subsets, we know that $\mathfrak m(f^{-1}(N \setminus \cup_n K_n)) = 0$.

(b) By (a) the claimed property is true if $g$ is identically 0. In the general case we notice that $|\nabla(f - g)|_* + |\nabla g|_*$ is a relaxed gradient of $f$, hence on $\{f - g = c\}$ we conclude that $\mathfrak m$-a.e. it holds $|\nabla f|_* \le |\nabla g|_*$. Reversing the roles of $f$ and $g$ we conclude.

(c) By (a) and Rademacher Theorem we know that the right hand side is well defined, so that the statement makes sense (with the convention to define $|\phi' \circ f|$ arbitrarily at points $x$ such that $\phi'$ does not exist at $f(x)$). Also, by (4.13) we know that the thesis is true if $\phi$ is $C^1$. For the general case, just approximate $\phi$ with a sequence $(\phi_n)$ of equi-Lipschitz and $C^1$ functions, such that $\phi_n' \to \phi'$ a.e. on the image of $f$.

(d) Arguing as in (c) we see that it is sufficient to prove the claim under the further assumption that $\phi$ is $C^1$, thus we assume this regularity. Also, with no loss of generality we can assume that $0 \le \phi' \le 1$. We know that $(1 - \phi'(f))|\nabla f|_*$ and $\phi'(f)|\nabla f|_*$ are relaxed gradients of $f - \phi(f)$ and $\phi(f)$ respectively. Since

\[ |\nabla f|_* \le |\nabla(f - \phi(f))|_* + |\nabla\phi(f)|_* \le \big((1 - \phi'(f)) + \phi'(f)\big)|\nabla f|_* = |\nabla f|_* \]

it follows that all inequalities are equalities $\mathfrak m$-a.e. in $X$. □
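
As an illustration of how (a) and (d) combine, consider the truncation $\phi(r) = \min\{r, C\}$ for a constant $C$ (with $C \ge 0$ when $\mathfrak m$ is not finite, so that $0 \in J$ and $\phi(0) = 0$). This is a sketch of a standard consequence under these assumptions, not a statement taken from the text above; $\phi$ is nondecreasing and 1-Lipschitz, with $\phi' = 1$ on $(-\infty, C)$ and $\phi' = 0$ on $(C, \infty)$:

```latex
% phi'(f) is undefined only on {f = C} = f^{-1}({C}); the set N = {C} is
% L^1-negligible, so (a) gives |\nabla f|_* = 0 there and the value chosen
% for phi' at C is irrelevant.
\begin{aligned}
|\nabla \min\{f,C\}|_* &= \phi'(f)\,|\nabla f|_*
  = \chi_{\{f<C\}}\,|\nabla f|_*
  \qquad \mathfrak m\text{-a.e. in } X, \\
\mathsf{Ch}_*(\min\{f,C\}) &= \tfrac12\int_{\{f<C\}} |\nabla f|_*^2\,\mathrm d\mathfrak m
  \;\le\; \mathsf{Ch}_*(f).
\end{aligned}
```

This is the form in which the chain rule enters truncation arguments such as the one used for the comparison principle.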

Taking the locality into account, we can extend the relaxed gradient from $L^2(X, \mathfrak m)$ to the class of $\mathfrak m$-measurable maps $f$ whose truncates $f_N := \min\{N, \max\{f, -N\}\}$ belong to $D(\mathsf{Ch}_*) \subset L^2(X, \mathfrak m)$ for any integer $N$ in the following way:

\[ |\nabla f|_* := |\nabla f_N|_* \quad \mathfrak m\text{-a.e. on } \{|f| < N\}. \tag{4.14} \]

Accordingly, we can extend Cheeger's functional (4.10) as follows:

\[
\mathsf{Ch}_*(f) :=
\begin{cases}
\dfrac12 \displaystyle\int_X |\nabla f|_*^2 \,\mathrm d\mathfrak m & \text{if } f_N \in D(\mathsf{Ch}_*) \text{ for all } N \ge 1, \\[2mm]
+\infty & \text{otherwise.}
\end{cases}
\tag{4.15}
\]

It is obvious that $\mathsf{Ch}_*$ is convex and, when $\mathfrak m(X) < \infty$, it is sequentially lower semicontinuous with respect to convergence $\mathfrak m$-a.e. in $X$: we shall see that this property holds even when $\mathfrak m$ is not finite but satisfies (4.2). We shall use this extension when we compare relaxed and weak upper gradients, see Theorem 6.2.

Here it is useful to introduce the Fisher information functional: 

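Definition 4.9 (Fisher information) We define the Fisher information $\mathsf F(f)$ of a Borel function $f : X \to [0, \infty)$ as

\[ \mathsf F(f) := 4 \int_X |\nabla\sqrt f|_*^2 \,\mathrm d\mathfrak m = 8\,\mathsf{Ch}_*(\sqrt f), \tag{4.16} \]

if $\sqrt f \in D(\mathsf{Ch}_*)$ and we define $\mathsf F(f) = +\infty$ otherwise.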

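Lemma 4.10 (Properties of F) For every Borel function $f : X \to [0, \infty)$ we have the equivalence

\[ f \in D(\mathsf F) \iff f, |\nabla f|_* \in L^1(X, \mathfrak m), \quad \int_{\{f>0\}} \frac{|\nabla f|_*^2}{f} \,\mathrm d\mathfrak m < \infty, \tag{4.17} \]

and in this case it holds

\[ \mathsf F(f) = \int_{\{f>0\}} \frac{|\nabla f|_*^2}{f} \,\mathrm d\mathfrak m. \tag{4.18} \]

In addition, the functional $\mathsf F$ is convex and sequentially lower semicontinuous with respect to the weak topology of $L^1(X, \mathfrak m)$.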

Proof. By the definition of extended relaxed gradient it is sufficient to consider the case when $f$ is bounded. The right implication in (4.17) is an immediate consequence of Proposition 4.8 with $\phi(r) = r^2$. The reverse one still follows by applying the same property to $\phi_\varepsilon(r) = \sqrt{r + \varepsilon} - \sqrt\varepsilon$, $\varepsilon > 0$, and then passing to the limit as $\varepsilon \downarrow 0$.

The strong lower semicontinuity in $L^1(X, \mathfrak m)$ is an immediate consequence of the lower semicontinuity of Cheeger's energy in $L^2(X, \mathfrak m)$. The convexity of $\mathsf F$ follows by the representation of $\mathsf F$ given in (4.18), the convexity of $g \mapsto |\nabla g|_*$ stated in (4.11), and the convexity of the function $(x, y) \mapsto y^2/x$ in $(0, \infty) \times \mathbb R$. Since $\mathsf F$ is convex, its weak lower semicontinuity in $L^1(X, \mathfrak m)$ is a consequence of the strong one. □
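
The identity (4.18) can be checked by a short computation. The sketch below assumes $f$ bounded with $\sqrt f \in D(\mathsf{Ch}_*)$, and applies Proposition 4.8(d) to $g = \sqrt f$ with $\phi(r) = r^2$, which is nondecreasing and Lipschitz on $[0, \sup g]$ (the symbol $g$ is introduced here for illustration):

```latex
\begin{aligned}
|\nabla f|_* &= |\nabla \phi(g)|_* = \phi'(g)\,|\nabla g|_* = 2\sqrt f\,|\nabla\sqrt f|_*
  && \mathfrak m\text{-a.e. in } X, \\
\frac{|\nabla f|_*^2}{f} &= 4\,|\nabla\sqrt f|_*^2
  && \mathfrak m\text{-a.e. on } \{f>0\}, \\
|\nabla\sqrt f|_* &= 0
  && \mathfrak m\text{-a.e. on } \{f=0\}=g^{-1}(\{0\}) \text{ by Proposition 4.8(a)}, \\
\mathsf F(f) &= 4\int_X |\nabla\sqrt f|_*^2\,\mathrm d\mathfrak m
  = \int_{\{f>0\}} \frac{|\nabla f|_*^2}{f}\,\mathrm d\mathfrak m .
\end{aligned}
```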

We conclude this section with a result concerning general multiplicative perturbations of the measure $\mathfrak m$. Notice that the choice $\theta = e^{-V^2}$ with $V$ as in (4.2) implies (4.19) for arbitrary $r > 0$.

Lemma 4.11 (Invariance with respect to multiplicative perturbations of m) Let $\mathfrak m' = \theta\,\mathfrak m$ be another $\sigma$-finite Borel measure whose density $\theta$ satisfies the following condition: for every $K$ compact in $X$ there exist $r > 0$ and positive constants $c(K)$, $C(K)$ such that

\[ 0 < c(K) \le \theta \le C(K) < \infty \quad \mathfrak m\text{-a.e. on } K(r) := \{x \in X : \mathsf d(x, K) \le r\}. \tag{4.19} \]

Then the relaxed gradient $|\nabla f|'_*$ induced by $\mathfrak m'$ coincides $\mathfrak m$-a.e. with $|\nabla f|_*$ for every $f \in W^{1,2}(X, \mathsf d, \mathfrak m) \cap W^{1,2}(X, \mathsf d, \mathfrak m')$. If moreover there exists $r > 0$ such that (4.19) holds for every compact set $K \subset X$ then

\[ f \in W^{1,2}(X, \mathsf d, \mathfrak m), \quad f, |\nabla f|_* \in L^2(X, \mathfrak m') \implies f \in W^{1,2}(X, \mathsf d, \mathfrak m'). \tag{4.20} \]

Proof. Let us first notice that the role of $\mathfrak m$ and $\mathfrak m'$ in (4.19) can be inverted, since also $\mathfrak m$ is absolutely continuous w.r.t. $\mathfrak m'$ ((4.19) yields $\mathfrak m(K) = 0$ for every compact set $K$ with $\mathfrak m'(K) = 0$) and therefore its density $\mathrm d\mathfrak m/\mathrm d\mathfrak m' = \theta^{-1}$ w.r.t. $\mathfrak m'$ still satisfies (4.19).

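Let us prove that $|\nabla f|'_* \le |\nabla f|_*$: we argue by contradiction and we suppose that for some $f \in W^{1,2}(X, \mathsf d, \mathfrak m) \cap W^{1,2}(X, \mathsf d, \mathfrak m')$ the strict inequality $|\nabla f|'_* > |\nabla f|_*$ holds in a Borel set $B$ with $\mathfrak m'(B) > 0$.

By the regularity of $\mathfrak m'$ we can find a compact set $K \subset B$ with $\mathfrak m'(K) > 0$ (and therefore $\mathfrak m(K) > 0$ by (4.19)) and $r > 0$ such that (4.19) holds. Introducing a Lipschitz real function $\phi_r : \mathbb R \to [0,1]$ such that $\phi_r(v) = 1$ in $[0, r/3]$ and $\phi_r(v) = 0$ in $[2r/3, \infty)$, we consider the corresponding functions $\chi_r(x) := \phi_r(\mathsf d(x, K))$, which are lower semicontinuous, $\mathsf d$-Lipschitz, and satisfy $\chi_r(x) = |\nabla\chi_r(x)| = 0$ for every $x$ with $\mathsf d(x, K) > 2r/3$.

Applying Lemma 4.3 we find a sequence of Borel and $\mathsf d$-Lipschitz functions $f_n \in L^2(X, \mathfrak m)$ satisfying (4.6). It is easy to check that $f'_n := \chi_r f_n$ is a sequence of Borel $\mathsf d$-Lipschitz functions which converges strongly to $f' := \chi_r f$ in $L^2(X, \mathfrak m')$ by (4.19). Moreover, since

\[ |\nabla f'_n| \le \chi_r|\nabla f_n| + |f_n|\,\mathrm{Lip}(\chi_r) \quad \text{and} \quad |\nabla f'_n| = 0 \text{ on the open set } X \setminus K(r), \tag{4.21} \]

$|\nabla f'_n|$ is clearly uniformly bounded in $L^2(X, \mathfrak m')$ by (4.19), so that up to subsequence, it weakly converges to some function $G' \ge |\nabla f'|'_*$, and $|\nabla f'|'_* = |\nabla f|'_*$ $\mathfrak m'$-a.e. in $K$ by locality, since $f' = f$ in a neighbourhood of $K$. Since $|\nabla f'_n| = |\nabla f_n|$ in a $\mathsf d$-open set containing $K$, (4.6) yields $G' = |\nabla f|_*$ $\mathfrak m'$-a.e. in $K$ so that $|\nabla f|'_* \le |\nabla f|_*$ $\mathfrak m'$-a.e. in $K$, a contradiction. Inverting the role of $\mathfrak m$ and $\mathfrak m'$, we can also prove the converse inequality $|\nabla f|_* \le |\nabla f|'_*$.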

In order to prove (4.20), let $K_n$ be a sequence of compact sets such that $\chi_{K_n} \uparrow 1$ as $n \to \infty$ $\mathfrak m$-a.e. in $X$ (recall that the finite measure $\tilde{\mathfrak m} = \vartheta\mathfrak m$ defined by (4.1) is tight); by the previous argument and (4.19) (which now, by assumption, holds uniformly with respect to $K_n$) we find a sequence $\chi_n(x) := \phi_r(\mathsf d(x, K_n))$ uniformly $\mathsf d$-Lipschitz such that $\chi_n f \in W^{1,2}(X, \mathsf d, \mathfrak m')$. Since $\chi_n f$ converges strongly to $f$ in $L^2(X, \mathfrak m')$ and (4.21) yields $|\nabla(\chi_n f)|_* \le |\nabla f|_* + \frac{3}{r}|f|$, we deduce that $|\nabla(\chi_n f)|'_* = |\nabla(\chi_n f)|_*$ is uniformly bounded in $L^2(X, \mathfrak m')$; applying (b) of Lemma 4.3 we conclude. □

Remark 4.12 Although the content of this section makes sense in a general metric measure 
space, it should be remarked that if no additional assumption is made it may happen that 
the constructions presented here are trivial. 

Consider for instance the case of the interval $[0, 1] \subset \mathbb R$ endowed with the Euclidean distance and a probability measure $\mathfrak m$ concentrated on the set $\{q_n\}_{n\in\mathbb N}$ of rational points in $(0, 1)$. For every $n \ge 1$ we consider an open set $A_n \supset \mathbb Q \cap (0, 1)$ with Lebesgue measure less than $1/n$ and the 1-Lipschitz function $j_n(x) = \mathscr L^1([0, x] \setminus A_n)$, locally constant in $A_n$. If $f$ is any
$L$-Lipschitz function in $[0, 1]$, then $f_n(x) := f(j_n(x))$ is still $L$-Lipschitz and satisfies

\[ \int_{[0,1]} |\nabla f_n|^2 \,\mathrm d\mathfrak m = 0. \]

Since $j_n(x) \to x$, $f_n \to f$ strongly in $L^2([0, 1]; \mathfrak m)$ as $n \to \infty$ and we obtain that $\mathsf{Ch}_*(f) = 0$. Hence Cheeger's functional is identically 0 and the corresponding gradient flows that we shall study in the sequel are simply the constant curves.

Another simple example is $X = [0, 1]$ endowed with the Lebesgue measure $\mathfrak m$ and the distance $\mathsf d(x, y) := |y - x|^{1/2}$. It is easy to check that $|\nabla f|(x) = 0$ for every $f \in C^1([0, 1])$ (which is in particular $\mathsf d$-Lipschitz), so that a standard approximation argument yields $\mathsf{Ch}_*(f) = 0$ for every $f \in L^2([0, 1]; \mathfrak m)$.
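
The vanishing of the slope in this second example follows from a one-line estimate. A sketch, assuming the distance is $|y - x|^{1/2}$ as in the example, $f \in C^1([0,1])$, and writing $\|f'\|_\infty$ for the supremum of $|f'|$ (notation introduced here):

```latex
\begin{aligned}
\frac{|f(y)-f(x)|}{\mathsf d(x,y)}
  &= \frac{|f(y)-f(x)|}{|y-x|^{1/2}}
  \le \|f'\|_\infty\,|y-x|^{1/2}
  \;\longrightarrow\; 0 \quad\text{as } y\to x, \\
|\nabla f|(x) &= \limsup_{y\to x}\frac{|f(y)-f(x)|}{\mathsf d(x,y)} = 0 .
\end{aligned}
```

Since $|y - x|^{1/2} \le 1$ on $[0,1]$, the same bound also shows that $f$ is $\mathsf d$-Lipschitz with constant at most $\|f'\|_\infty$.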
4.2 Laplacian and gradient flow of Cheeger's energy

In this subsection we assume, besides $\sigma$-finiteness, that the measure $\mathfrak m$ satisfies the condition in (4.5) (weaker than (4.2)), so that the domain of $\mathsf{Ch}_*$ is dense in $L^2(X, \mathfrak m)$ by Proposition 4.1.

The Hilbertian theory of gradient flows (see for instance [8], [4]) can be applied to Cheeger's functional (4.10) to provide, for all $f_0 \in L^2(X, \mathfrak m)$, a locally Lipschitz map $t \mapsto f_t = \mathsf H_t(f_0)$ from $(0, \infty)$ to $L^2(X, \mathfrak m)$, with $f_t \to f_0$ as $t \downarrow 0$, whose derivative satisfies

\[ \frac{\mathrm d}{\mathrm dt} f_t \in -\partial^-\mathsf{Ch}_*(f_t) \quad \text{for a.e. } t \in (0, \infty). \tag{4.22} \]

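Recall that the subdifferential $\partial^-\mathsf{Ch}_*$ of convex analysis is the multivalued operator in $L^2(X, \mathfrak m)$ defined at all $f \in D(\mathsf{Ch}_*)$ by the family of inequalities

\[ \ell \in \partial^-\mathsf{Ch}_*(f) \iff \int_X \ell\,(g - f) \,\mathrm d\mathfrak m \le \mathsf{Ch}_*(g) - \mathsf{Ch}_*(f) \quad \text{for every } g \in L^2(X, \mathfrak m). \tag{4.23} \]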

The map $\mathsf H_t : f_0 \mapsto f_t$ is uniquely determined by (4.22) and defines a semigroup of contractions in $L^2(X, \mathfrak m)$. Furthermore, we have the regularization estimate

\[ \mathsf{Ch}_*(f_t) \le \inf\Big\{ \mathsf{Ch}_*(g) + \frac{1}{2t} \int_X |g - f_0|^2 \,\mathrm d\mathfrak m : g \in W^{1,2}(X, \mathsf d, \mathfrak m) \Big\}. \tag{4.24} \]

Another important regularizing effect of gradient flows lies in the fact that, for every $t > 0$, the right derivative $\frac{\mathrm d^+}{\mathrm dt} f_t$ exists and it is actually the element with minimal $L^2(X, \mathfrak m)$ norm in $-\partial^-\mathsf{Ch}_*(f_t)$. This motivates the next definition:

Definition 4.13 ((d, m)-Laplacian) The Laplacian $\Delta_{\mathsf d,\mathfrak m} f$ of $f \in L^2(X, \mathfrak m)$ is defined for those $f$ such that $\partial^-\mathsf{Ch}_*(f) \ne \emptyset$. For those $f$, $-\Delta_{\mathsf d,\mathfrak m} f$ is the element of minimal $L^2(X, \mathfrak m)$ norm in $\partial^-\mathsf{Ch}_*(f)$.

The domain of $\Delta_{\mathsf d,\mathfrak m}$ will be denoted by $D(\Delta_{\mathsf d,\mathfrak m})$, since there is no risk of confusion with the notation (2.5) introduced for extended real valued maps; in this connection, notice that convexity and lower semicontinuity of $\mathsf{Ch}_*$ ensure the identity $D(\Delta_{\mathsf d,\mathfrak m}) = D(|\nabla^-\mathsf{Ch}_*|)$, see [4, Proposition 1.4.4]. We can now write

\[ \frac{\mathrm d^+}{\mathrm dt} f_t = \Delta_{\mathsf d,\mathfrak m} f_t \quad \text{for every } t \in (0, \infty) \]

for gradient flows $f_t$ of $\mathsf{Ch}_*$, in agreement with the classical case. However, not all classical properties remain valid, as illustrated in the next remark.

Remark 4.14 (Potential lack of linearity) It should be observed that, in general, the Laplacian we just defined is not a linear operator: the potential lack of linearity is strictly related to the fact that the space $W^{1,2}(X, \mathsf d, \mathfrak m)$ need not be Hilbert, see also Remark 4.7. However, the Laplacian (and the corresponding gradient flow $\mathsf H_t$) is always 1-homogeneous, namely

\[ \Delta_{\mathsf d,\mathfrak m}(\lambda f) = \lambda\,\Delta_{\mathsf d,\mathfrak m} f, \qquad \mathsf H_t(\lambda f) = \lambda\,\mathsf H_t(f) \quad \text{for all } f \in D(\Delta_{\mathsf d,\mathfrak m}) \text{ and } \lambda \in \mathbb R. \]

This is indeed a property true for the subdifferential of any 2-homogeneous functional $\Phi$; to prove it, if $\lambda \ne 0$ (the case $\lambda = 0$ being trivial) and $\xi \in \partial\Phi(x)$ it suffices to multiply the subdifferential inequality $\Phi(\lambda^{-1}y) \ge \Phi(x) + \langle \xi, \lambda^{-1}y - x\rangle$ by $\lambda^2$ to get $\lambda\xi \in \partial\Phi(\lambda x)$. ■
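
The scaling argument can be written out term by term. The sketch assumes a functional $\Phi$ on a Hilbert space with $\Phi(\lambda z) = \lambda^2\Phi(z)$ for every real $\lambda$, a fixed $\lambda \ne 0$, and $\xi \in \partial\Phi(x)$; each term of the subdifferential inequality, tested at the point $\lambda^{-1}y$, is multiplied by $\lambda^2$:

```latex
\begin{aligned}
&\lambda^2\,\Phi(\lambda^{-1}y) = \Phi(y), \qquad
\lambda^2\,\Phi(x) = \Phi(\lambda x), \qquad
\lambda^2\,\langle \xi,\lambda^{-1}y-x\rangle = \langle \lambda\xi,\,y-\lambda x\rangle, \\
&\Phi(\lambda^{-1}y) \ge \Phi(x)+\langle\xi,\lambda^{-1}y-x\rangle
 \;\Longrightarrow\;
\Phi(y) \ge \Phi(\lambda x)+\langle\lambda\xi,\,y-\lambda x\rangle
 \quad\text{for every } y .
\end{aligned}
```

The last inequality, valid for every $y$, is the definition of $\lambda\xi \in \partial\Phi(\lambda x)$.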
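Proposition 4.15 (Some properties of the Laplacian) For all $f \in D(\Delta_{\mathsf d,\mathfrak m})$, $g \in D(\mathsf{Ch}_*)$ it holds

\[ -\int_X g\,\Delta_{\mathsf d,\mathfrak m} f \,\mathrm d\mathfrak m \le \int_X |\nabla g|_*|\nabla f|_* \,\mathrm d\mathfrak m. \tag{4.25} \]

Also, let $f \in D(\Delta_{\mathsf d,\mathfrak m})$ and $\phi : J \to \mathbb R$ Lipschitz, with $J$ closed interval containing the image of $f$ ($\phi(0) = 0$ if $\mathfrak m(X) = \infty$). Then

\[ -\int_X \phi(f)\,\Delta_{\mathsf d,\mathfrak m} f \,\mathrm d\mathfrak m = \int_X |\nabla f|_*^2\,\phi'(f) \,\mathrm d\mathfrak m. \tag{4.26} \]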

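Proof. Since $-\Delta_{\mathsf d,\mathfrak m} f \in \partial^-\mathsf{Ch}_*(f)$ it holds

\[ \mathsf{Ch}_*(f) - \int_X \varepsilon g\,\Delta_{\mathsf d,\mathfrak m} f \,\mathrm d\mathfrak m \le \mathsf{Ch}_*(f + \varepsilon g) \quad \forall \varepsilon \in \mathbb R. \]

For $\varepsilon > 0$, $|\nabla f|_* + \varepsilon|\nabla g|_*$ is a relaxed gradient of $f + \varepsilon g$. Thus it holds $2\mathsf{Ch}_*(f + \varepsilon g) \le \int_X (|\nabla f|_* + \varepsilon|\nabla g|_*)^2 \,\mathrm d\mathfrak m$ and therefore

\[ -\int_X \varepsilon g\,\Delta_{\mathsf d,\mathfrak m} f \,\mathrm d\mathfrak m \le \frac12 \int_X \big((|\nabla f|_* + \varepsilon|\nabla g|_*)^2 - |\nabla f|_*^2\big) \,\mathrm d\mathfrak m = \varepsilon \int_X |\nabla f|_*|\nabla g|_* \,\mathrm d\mathfrak m + o(\varepsilon). \]

Dividing by $\varepsilon$, letting $\varepsilon \downarrow 0$ we get (4.25).

For the second part we recall that, by the chain rule, $|\nabla(f + \varepsilon\phi(f))|_* = (1 + \varepsilon\phi'(f))|\nabla f|_*$ for $|\varepsilon|$ small enough. Hence

\[ \mathsf{Ch}_*(f + \varepsilon\phi(f)) - \mathsf{Ch}_*(f) = \frac12 \int_X |\nabla f|_*^2\big((1 + \varepsilon\phi'(f))^2 - 1\big) \,\mathrm d\mathfrak m = \varepsilon \int_X |\nabla f|_*^2\,\phi'(f) \,\mathrm d\mathfrak m + o(\varepsilon), \]

which implies that for any $v \in \partial^-\mathsf{Ch}_*(f)$ it holds $\int_X v\,\phi(f) \,\mathrm d\mathfrak m = \int_X |\nabla f|_*^2\,\phi'(f) \,\mathrm d\mathfrak m$, and gives the thesis with $v = -\Delta_{\mathsf d,\mathfrak m} f$. □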
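
The last implication uses the subdifferential inequality (4.23) with both signs of $\varepsilon$. A sketch of this step, for $v \in \partial^-\mathsf{Ch}_*(f)$ and the competitor $g = f + \varepsilon\phi(f)$:

```latex
\begin{aligned}
\varepsilon\int_X v\,\phi(f)\,\mathrm d\mathfrak m
 &\le \mathsf{Ch}_*(f+\varepsilon\phi(f))-\mathsf{Ch}_*(f)
  = \varepsilon\int_X |\nabla f|_*^2\,\phi'(f)\,\mathrm d\mathfrak m + o(\varepsilon), \\
\text{dividing by } \varepsilon>0,\ \varepsilon\downarrow 0:\quad
 \int_X v\,\phi(f)\,\mathrm d\mathfrak m &\le \int_X |\nabla f|_*^2\,\phi'(f)\,\mathrm d\mathfrak m, \\
\text{dividing by } \varepsilon<0,\ \varepsilon\uparrow 0:\quad
 \int_X v\,\phi(f)\,\mathrm d\mathfrak m &\ge \int_X |\nabla f|_*^2\,\phi'(f)\,\mathrm d\mathfrak m .
\end{aligned}
```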

Theorem 4.16 (Comparison principle, convex entropies and contraction) Let $f_t = \mathsf H_t(f_0)$ be the gradient flow of $\mathsf{Ch}_*$ starting from $f_0 \in L^2(X, \mathfrak m)$.

(a) Assume that $f_0 \le C$ (resp. $f_0 \ge c$). Then $f_t \le C$ (resp. $f_t \ge c$) for every $t \ge 0$.

(b) If $g_0 \in L^2(X, \mathfrak m)$ and $f_0 \le g_0 + c$ $\mathfrak m$-a.e. in $X$ for some constant $c \ge 0$, then $f_t \le g_t + c$ $\mathfrak m$-a.e. in $X$, where $g_t = \mathsf H_t(g_0)$ is the gradient flow starting from $g_0$. In particular the semigroup $\mathsf H_t : L^2(X, \mathfrak m) \to L^2(X, \mathfrak m)$ satisfies the contraction property

\[ \|\mathsf H_t(f_0) - \mathsf H_t(g_0)\|_{L^p(X,\mathfrak m)} \le \|f_0 - g_0\|_{L^p(X,\mathfrak m)} \quad \forall f_0, g_0 \in L^2(X, \mathfrak m) \cap L^p(X, \mathfrak m) \tag{4.27} \]

for every $p \in [2, \infty]$.

(c) If $e : \mathbb R \to [0, \infty]$ is a convex lower semicontinuous function and $E(f) := \int_X e(f) \,\mathrm d\mathfrak m$ is the associated convex and lower semicontinuous functional in $L^2(X, \mathfrak m)$ it holds

\[ E(f_t) \le E(f_0) \quad \text{for every } t \ge 0. \tag{4.28} \]

In particular, if $p \in [1, \infty]$ and $f_0 \in L^p(X, \mathfrak m)$, then also $f_t \in L^p(X, \mathfrak m)$. Moreover, if $e'$ is locally Lipschitz in $\mathbb R$ and $E(f_0) < \infty$, then we have

\[ E(f_t) + \int_0^t \int_X e''(f_s)|\nabla f_s|_*^2 \,\mathrm d\mathfrak m \,\mathrm ds = E(f_0) \quad \forall t \ge 0. \tag{4.29} \]

(d) When $\mathfrak m(X) < \infty$ we have

\[ \int_X f_t \,\mathrm d\mathfrak m = \int_X f_0 \,\mathrm d\mathfrak m \quad \text{for every } t \ge 0, \tag{4.30} \]

and the evolution semigroup $\mathsf H_t$ satisfies (4.27) also for $p \in [1, 2]$ and, more generally, for $E$ as in (c) it holds

\[ E(f_t - g_t) \le E(f_0 - g_0). \tag{4.31} \]

Proof. By [8, Theorem 4.4], a convex and lower semicontinuous functional $E$ on $L^2(X, \mathfrak m)$ is nonincreasing along the gradient flow generated by $\mathsf{Ch}_*$ if and only if for every $f \in L^2(X, \mathfrak m)$ with $E(f) < \infty$ and $\lambda > 0$ we have $E(f^\lambda) \le E(f)$, where $f^\lambda$ is the unique minimizer of

\[ u \mapsto \mathsf{Ch}_*(u) + \frac{1}{2\lambda} \int_X |u - f|^2 \,\mathrm d\mathfrak m. \tag{4.32} \]

Notice that $f^\lambda$ is also the unique solution of

\[ f^\lambda - \lambda\,\Delta_{\mathsf d,\mathfrak m} f^\lambda = f. \tag{4.33} \]

In order to prove (a), we can apply the classical Stampacchia truncation argument and prove that $f \le C$ entails $f^\lambda \le C$. Indeed, if this is not the case we can consider the competitor $g := \min\{f^\lambda, C\}$ in the above minimization problem. Its Cheeger energy is less than or equal to that of $f^\lambda$ (by applying Proposition 4.8(d)) and the $L^2$ distance between $f$ and $g$ is strictly smaller than the one between $f$ and $f^\lambda$, if $\mathfrak m(\{f^\lambda > C\}) > 0$. The same argument applies to uniform bounds from below.
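
The strict decrease of the $L^2$ distance in the truncation argument can be verified pointwise. A sketch, assuming $f \le C$ $\mathfrak m$-a.e. and writing $g = \min\{f^\lambda, C\}$ as in the proof:

```latex
\begin{aligned}
&\text{on } \{f^\lambda\le C\}: && g=f^\lambda, \quad |g-f| = |f^\lambda-f|, \\
&\text{on } \{f^\lambda> C\}: && g=C, \quad 0\le C-f < f^\lambda-f, \quad\text{so } |g-f|<|f^\lambda-f|, \\
&\text{hence} && \int_X |g-f|^2\,\mathrm d\mathfrak m < \int_X |f^\lambda-f|^2\,\mathrm d\mathfrak m
   \quad\text{whenever } \mathfrak m(\{f^\lambda>C\})>0 .
\end{aligned}
```

Together with $\mathsf{Ch}_*(g) \le \mathsf{Ch}_*(f^\lambda)$ this would contradict the minimality of $f^\lambda$.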

In order to prove the first part of (b) it suffices to show a discrete maximum principle analogous to (a) (but notice that constants need not be in $L^2(X, \mathfrak m)$), i.e. if $f^\lambda$ is as above, with $f = f_0$, and $g^\lambda$ minimizes $u \mapsto \mathsf{Ch}_*(u) + \frac{1}{2\lambda}\|u - g_0\|_2^2$, then $f^\lambda \le g^\lambda + c$ $\mathfrak m$-a.e. in $X$. Indeed, iterating this estimate the convergence of the Euler scheme to gradient flows provides the result. We can assume with no loss of generality that $f_0 \le g_0 + c$ $\mathfrak m$-a.e. in $X$.

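Let $\tilde f^\lambda := \min\{f^\lambda, g^\lambda + c\}$, $\tilde g^\lambda := \max\{f^\lambda - c, g^\lambda\}$, $A = \{f^\lambda > g^\lambda + c\}$, $B = X \setminus A$. Notice that $\tilde f^\lambda = f^\lambda - (f^\lambda - g^\lambda - c)^+$ belongs to $W^{1,2}(X, \mathsf d, \mathfrak m)$ by (c) of Proposition 4.8 (here $\phi(r) = (r - c)^+$ and $c \ge 0$), and the same property holds for $\tilde g^\lambda$. Our goal is to show that $\mathfrak m(A) = 0$; to this aim, adding the inequalities

\[
\begin{aligned}
\mathsf{Ch}_*(f^\lambda) + \frac{1}{2\lambda} \int_X |f^\lambda - f_0|^2 \,\mathrm d\mathfrak m &\le \mathsf{Ch}_*(\tilde f^\lambda) + \frac{1}{2\lambda} \int_X |\tilde f^\lambda - f_0|^2 \,\mathrm d\mathfrak m, \\
\mathsf{Ch}_*(g^\lambda) + \frac{1}{2\lambda} \int_X |g^\lambda - g_0|^2 \,\mathrm d\mathfrak m &\le \mathsf{Ch}_*(\tilde g^\lambda) + \frac{1}{2\lambda} \int_X |\tilde g^\lambda - g_0|^2 \,\mathrm d\mathfrak m
\end{aligned}
\]

and taking the identity $\mathsf{Ch}_*(\tilde f^\lambda) + \mathsf{Ch}_*(\tilde g^\lambda) = \mathsf{Ch}_*(f^\lambda) + \mathsf{Ch}_*(g^\lambda)$ into account by Proposition 4.8(b), we get

\[
\begin{aligned}
&\int_A |f^\lambda - f_0|^2 + |g^\lambda - g_0|^2 \,\mathrm d\mathfrak m + \int_B |f^\lambda - f_0|^2 + |g^\lambda - g_0|^2 \,\mathrm d\mathfrak m \\
&\qquad \le \int_A |g^\lambda + c - f_0|^2 + |f^\lambda - c - g_0|^2 \,\mathrm d\mathfrak m + \int_B |f^\lambda - f_0|^2 + |g^\lambda - g_0|^2 \,\mathrm d\mathfrak m,
\end{aligned}
\]

so that, simplifying and rearranging terms we get

\[ 2 \int_A (f_0 - g_0 - c)(g^\lambda + c - f^\lambda) \,\mathrm d\mathfrak m \le 0. \]

Since $f_0 \le g_0 + c$ and $g^\lambda + c < f^\lambda$ in $A$, this can happen only if $\mathfrak m(A) = 0$.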

The estimate (4.27) for $p = \infty$ follows immediately by applying the previous comparison principle with $c := \|f_0 - g_0\|_{L^\infty(X,\mathfrak m)}$ and then reverting the role of $f_t$ and $g_t$ in the inequality. When $p = 2$, (4.27) is the well known contraction property for gradient flows of convex functionals in Hilbert spaces; the general case $p \in (2, \infty)$ follows then by interpolation, see [37, 32].

In order to prove the first claim of (c) we can assume $E(f_0) < \infty$. Assuming temporarily $e'$ to be Lipschitz, we can multiply (4.33) by $e'(f^\lambda) \in L^2(X, \mathfrak m)$ (notice that if $\mathfrak m(X) = \infty$ then $e(0) = 0$, since $E(f_0) < \infty$, and $e'(0) = 0$, since 0 is a minimum point for $e$), obtaining after an integration and the application of (4.26)

\[ \int_X f^\lambda e'(f^\lambda) \,\mathrm d\mathfrak m + \lambda \int_X e''(f^\lambda)|\nabla f^\lambda|_*^2 \,\mathrm d\mathfrak m = \int_X f\,e'(f^\lambda) \,\mathrm d\mathfrak m, \]

so that, by the convexity of $e$,

\[ E(f^\lambda) - E(f) \le \int_X (f^\lambda - f)\,e'(f^\lambda) \,\mathrm d\mathfrak m \le 0. \]

A standard approximation, replacing $e(r)$ by its Yosida approximation, yields the same result for general $e$.

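The second claim of (c) follows by a similar argument: notice that by (a) we know that the image of $f_t$ is contained in the same interval containing the image of $f_0$. We can also assume, by truncation, that this interval is closed, bounded, and that $e'$ is Lipschitz in it (the interval also contains 0 if $\mathfrak m(X) = \infty$ and $e'(0) = 0$). Also, we know that the map $t \mapsto f_t$ is locally Lipschitz in $(0, \infty)$ with values in $L^2(X, \mathfrak m)$, the same is true for the map $t \mapsto e(f_t)$ and $\frac{\mathrm d}{\mathrm dt} e(f_t) = e'(f_t)\frac{\mathrm d}{\mathrm dt} f_t = e'(f_t)\,\Delta_{\mathsf d,\mathfrak m} f_t$. Thus the map $t \mapsto E(f_t)$ is locally Lipschitz in $(0, \infty)$ and using equation (4.26) with $\phi = e'$ we get

\[ \frac{\mathrm d}{\mathrm dt} \int_X e(f_t) \,\mathrm d\mathfrak m = \int_X e'(f_t)\,\Delta_{\mathsf d,\mathfrak m} f_t \,\mathrm d\mathfrak m = -\int_X e''(f_t)|\nabla f_t|_*^2 \,\mathrm d\mathfrak m. \tag{4.34} \]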

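In order to prove (d) we notice that $\mathfrak m(X) < \infty$ allows the choice of $g = \pm 1$ in (4.25), to obtain $\int_X \Delta_{\mathsf d,\mathfrak m} h \,\mathrm d\mathfrak m = 0$ for all $h \in D(\Delta_{\mathsf d,\mathfrak m})$. Hence (4.30) follows by

\[ \frac{\mathrm d}{\mathrm dt} \int_X f_t \,\mathrm d\mathfrak m = \int_X \Delta_{\mathsf d,\mathfrak m} f_t \,\mathrm d\mathfrak m = 0. \]

The contraction property (4.27) when $p = 1$ follows now by the classical argument of Crandall and Tartar [11, Prop. 1]; (4.31) can be obtained by applying [9, Lemma 3] to the map $T(h) := \mathsf H_t(h + g_0) - \mathsf H_t(g_0)$ and choosing $h := f_0 - g_0$. □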
4.3 Increasing family of measures and variational approximation of Cheeger's energy

In this section we study a monotone approximation scheme for Cheeger's energy and its gradient flow, which turns out to be quite useful when $\mathfrak m(X) = \infty$ and one is interested in extending the validity of suitable estimates, which can be more easily obtained in the case of measures with finite total mass.

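Let us consider an increasing sequence of $\sigma$-finite, Borel measures $\mathfrak m^0 \le \mathfrak m^1 \le \cdots \le \mathfrak m^k \le \mathfrak m^{k+1} \le \cdots$ converging to the limit measure $\mathfrak m$ in the sense that

\[ \lim_{k\to\infty} \mathfrak m^k(B) = \mathfrak m(B) \quad \text{for every } B \in \mathscr B(X). \tag{4.35} \]

Let us assume that, as in (4.19), $\mathfrak m \ll \mathfrak m^0$ with density $\theta = \dfrac{\mathrm d\mathfrak m}{\mathrm d\mathfrak m^0}$ satisfying

\[ 0 < c(K) \le \theta \le C(K) < \infty \quad \mathfrak m^0\text{-a.e. on } K(r) := \{x \in X : \mathsf d(x, K) \le r\} \tag{4.36} \]

for any compact set $K \subset X$, with $r = r(K) > 0$. Notice that the measures $\mathfrak m^k$ share the same collection of negligible sets and of measurable functions. We denote by $\mathcal H^k := L^2(X, \mathfrak m^k)$ and by $\mathsf{Ch}^k_*$ the Cheeger's energy associated to $\mathfrak m^k$ in $W^{1,2}(X, \mathsf d, \mathfrak m^k) \subset \mathcal H^k$, extended to $+\infty$ in $\mathcal H^k \setminus W^{1,2}(X, \mathsf d, \mathfrak m^k)$. We have $\mathcal H^{k+1} \subset \mathcal H^k \subset \mathcal H^0$, with continuous inclusion and, by Lemma 4.11, $\mathsf{Ch}^k_* \le \mathsf{Ch}^{k+1}_*$.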

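Proposition 4.17 (Γ-convergence) Let $(\mathfrak m^k)$ be an increasing sequence of $\sigma$-finite measures satisfying (4.35) and (4.36). If $f^k \in \mathcal H^k$ weakly converge in $\mathcal H^0$ to $f$ with $S := \limsup_k \int_X |f^k|^2 \,\mathrm d\mathfrak m^k < \infty$ then $f \in L^2(X, \mathfrak m)$,

\[ \liminf_{k\to\infty} \int_X |f^k|^2 \,\mathrm d\mathfrak m^k \ge \int_X |f|^2 \,\mathrm d\mathfrak m, \qquad \liminf_{k\to\infty} \mathsf{Ch}^k_*(f^k) \ge \mathsf{Ch}_*(f), \tag{4.37} \]

and

\[ \lim_{k\to\infty} \int_X f^k g \,\mathrm d\mathfrak m^k = \int_X f g \,\mathrm d\mathfrak m \quad \text{for every } g \in L^2(X, \mathfrak m). \tag{4.38} \]

Finally, if $S \le \int_X |f|^2 \,\mathrm d\mathfrak m$ then

\[ f^k \to f \text{ strongly in } \mathcal H^0 \quad \text{and} \quad \lim_{k\to\infty} \int_X |f^k|^2 \,\mathrm d\mathfrak m^k = \int_X |f|^2 \,\mathrm d\mathfrak m. \tag{4.39} \]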

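Proof. (4.37) is an easy consequence of the monotonicity of $\mathfrak m^k$, the lower semicontinuity of the $L^2$-norm with respect to weak convergence, and (4.20) of Lemma 4.11. In order to check (4.38) notice that for every $g \in L^2(X, \mathfrak m)$ and every $\nu > 0$

\[ \int_X f^k g \,\mathrm d\mathfrak m^k = \frac12 \int_X (\nu f^k + \nu^{-1} g)^2 \,\mathrm d\mathfrak m^k - \frac{\nu^2}{2} \int_X |f^k|^2 \,\mathrm d\mathfrak m^k - \frac{1}{2\nu^2} \int_X |g|^2 \,\mathrm d\mathfrak m^k, \tag{4.40} \]

so that, taking the limit as $k \to \infty$,

\[
\begin{aligned}
\liminf_{k\to\infty} \int_X f^k g \,\mathrm d\mathfrak m^k
 &\ge \frac12 \int_X (\nu f + \nu^{-1} g)^2 \,\mathrm d\mathfrak m - \frac{\nu^2}{2} S - \frac{1}{2\nu^2} \int_X |g|^2 \,\mathrm d\mathfrak m \\
 &= \int_X f g \,\mathrm d\mathfrak m + \frac{\nu^2}{2}\Big( \int_X |f|^2 \,\mathrm d\mathfrak m - S \Big).
\end{aligned}
\]

Passing to the limit as $\nu \downarrow 0$ and applying the same inequality with $-g$ in place of $g$ we get (4.38). Finally, (4.39) follows easily by (4.38) and the inequality $S \le \int_X |f|^2 \,\mathrm d\mathfrak m$, passing to the limit in

\[ \int_X |f^k - f|^2 \,\mathrm d\mathfrak m^k = -2 \int_X f^k f \,\mathrm d\mathfrak m^k + \int_X |f^k|^2 \,\mathrm d\mathfrak m^k + \int_X |f|^2 \,\mathrm d\mathfrak m^k. \]

□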

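Let us now consider the gradient flow $\mathsf H^k_t$ generated by $\mathsf{Ch}^k_*$ in $\mathcal H^k$ and the "limit" semigroup $\mathsf H_t$ generated by $\mathsf{Ch}_*$ in $\mathcal H = L^2(X, \mathfrak m) \subset \mathcal H^k$. Since any element $f_0$ of $\mathcal H$ belongs also to $\mathcal H^k$, the evolution $f^k_t := \mathsf H^k_t(f_0)$ is well defined for every $k$ and it is interesting to prove the convergence of $f^k_t$ to $f_t = \mathsf H_t(f_0)$ as $k \to \infty$ in the larger space $\mathcal H^0$.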

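Theorem 4.18 Let $f_0 \in L^2(X, \mathfrak m) \subset \mathcal H^0$ and let $f^k_t = \mathsf H^k_t(f_0) \in \mathcal H^k$ be the heat flow in $L^2(X, \mathfrak m^k)$, $f_t := \mathsf H_t(f_0) \in L^2(X, \mathfrak m)$. Then for every $t \ge 0$ we have

\[ \lim_{k\to\infty} f^k_t = f_t \text{ strongly in } \mathcal H^0, \qquad \lim_{k\to\infty} \int_X |f^k_t|^2 \,\mathrm d\mathfrak m^k = \int_X |f_t|^2 \,\mathrm d\mathfrak m. \tag{4.41} \]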

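Proof. The following classical argument combines the Γ-convergence result of the previous proposition with resolvent estimates; the only technical issue here is that the gradient flows are settled in Hilbert spaces $\mathcal H^k$ also depending on $k$.

Let us fix $\lambda > 0$ and let us consider the family of resolvent operators $J^k_\lambda : \mathcal H^k \to \mathcal H^k$ which to every $f^k \in \mathcal H^k$ associate the unique minimizer $f^k_\lambda$ of

\[ \mathcal E^k_\lambda(g; f^k) := \mathsf{Ch}^k_*(g) + \frac{\lambda}{2} \int_X |g - f^k|^2 \,\mathrm d\mathfrak m^k. \tag{4.42} \]

We first prove that if $\limsup_k \int_X |f^k|^2 \,\mathrm d\mathfrak m^k \le \int_X |f|^2 \,\mathrm d\mathfrak m$ then $f^k_\lambda := J^k_\lambda(f^k)$ converge to $f_\lambda := J_\lambda(f)$ as $k \to \infty$ according to (4.39). In fact we know that for every $g \in W^{1,2}(X, \mathsf d, \mathfrak m)$

\[ \mathsf{Ch}^k_*(f^k_\lambda) + \frac{\lambda}{2} \int_X |f^k_\lambda - f^k|^2 \,\mathrm d\mathfrak m^k \le \mathsf{Ch}^k_*(g) + \frac{\lambda}{2} \int_X |g - f^k|^2 \,\mathrm d\mathfrak m^k. \]

By the assumption on $f^k$ the right hand side of the previous inequality converges to $\mathsf{Ch}_*(g) + \frac{\lambda}{2} \int_X |g - f|^2 \,\mathrm d\mathfrak m$. Since the sequence $(f^k_\lambda)$ is uniformly bounded in $\mathcal H^0 = L^2(X, \mathfrak m^0)$, up to extracting a suitable subsequence we can assume that $f^k_\lambda$ weakly converge to some limit $\tilde f$ in $\mathcal H^0$; (4.37) yields

\[
\begin{aligned}
\mathsf{Ch}_*(\tilde f) + \frac{\lambda}{2} \int_X |\tilde f - f|^2 \,\mathrm d\mathfrak m
 &\le \liminf_{k\to\infty} \mathsf{Ch}^k_*(f^k_\lambda) + \frac{\lambda}{2} \int_X |f^k_\lambda - f^k|^2 \,\mathrm d\mathfrak m^k \\
 &\le \mathsf{Ch}_*(g) + \frac{\lambda}{2} \int_X |g - f|^2 \,\mathrm d\mathfrak m = \mathcal E_\lambda(g; f),
\end{aligned}
\]

for every $g \in W^{1,2}(X, \mathsf d, \mathfrak m)$. We deduce that $\tilde f = J_\lambda f$ is the unique minimizer of $g \mapsto \mathcal E_\lambda(g; f)$. In particular the whole sequence $(f^k_\lambda)$ converges weakly to $J_\lambda f$ in $\mathcal H^0$ and moreover

\[ \limsup_{k\to\infty} \int_X |f^k_\lambda - f^k|^2 \,\mathrm d\mathfrak m^k \le \int_X |f_\lambda - f|^2 \,\mathrm d\mathfrak m, \tag{4.43} \]

so that we can apply Proposition 4.17 and obtain (4.39) for the sequence $(f^k_\lambda)$.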






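Iterating this resolvent convergence property, we get the same result for the operator $(J^k_\lambda)^n$ obtained by $n$ iterated compositions of $J^k_\lambda$, for every $n \in \mathbb N$. By the general estimates for gradient flows, choosing $\lambda := n/t$, we know that

\[ \int_X |\mathsf H_t f_0 - (J_{n/t})^n f_0|^2 \,\mathrm d\mathfrak m \le \frac{t}{n}\,\mathsf{Ch}_*(f_0), \qquad \int_X |\mathsf H^k_t f_0 - (J^k_{n/t})^n f_0|^2 \,\mathrm d\mathfrak m^k \le \frac{t}{n}\,\mathsf{Ch}^k_*(f_0) \le \frac{t}{n}\,\mathsf{Ch}_*(f_0). \]

Since for every $n$ and every $t > 0$ we have $\lim_{k\to\infty} (J^k_{n/t})^n f_0 = (J_{n/t})^n f_0$ strongly in $\mathcal H^0$, combining the previous estimates we get the first convergence property of (4.41) when $\mathsf{Ch}_*(f_0) < \infty$. Since the domain of $\mathsf{Ch}_*$ is dense in $L^2(X, \mathfrak m)$ and $\mathsf H^k_t$ is a contraction semigroup, a simple approximation argument yields the general case when $f_0 \in L^2(X, \mathfrak m)$. Passing to the limit as $k \to \infty$ in the identities

\[ \frac12 \int_X |f^k_t|^2 \,\mathrm d\mathfrak m^k + \int_0^t 2\,\mathsf{Ch}^k_*(f^k_s) \,\mathrm ds = \frac12 \int_X |f_0|^2 \,\mathrm d\mathfrak m^k \tag{4.44} \]

and taking into account the corresponding identity for $\mathfrak m$ and $\mathsf{Ch}_*$ and the lower semicontinuity property (4.37) for $\mathsf{Ch}^k_*$, we obtain the second limit of (4.41). □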



4.4 Mass preservation and entropy dissipation when $\mathfrak m(X) = \infty$

Let us start by deriving useful "moment-entropy estimates", in the case of a measure $\mathfrak m$ with finite mass.

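Lemma 4.19 (Moment-entropy estimate) Let $\mathfrak m$ be a finite measure, let $V : X \to [0, \infty)$ be a Borel and $\mathsf d$-Lipschitz function, let $f_0 \in L^2(X, \mathfrak m)$ be nonnegative with

\[ \int_X e^{-V^2} \,\mathrm d\mathfrak m \le \int_X f_0 \,\mathrm d\mathfrak m, \qquad \int_X V^2 f_0 \,\mathrm d\mathfrak m < \infty, \tag{4.45} \]

and let $f_t = \mathsf H_t(f_0)$ be the solution of (4.22). Then the map $t \mapsto \int_X V^2 f_t \,\mathrm d\mathfrak m$ is locally absolutely continuous in $[0, \infty)$ and for every $t \ge 0$

\[ \int_X V^2 f_t \,\mathrm d\mathfrak m \le e^{4\mathrm{Lip}^2(V)t} \int_X f_0\big(\log f_0 + 2V^2\big) \,\mathrm d\mathfrak m, \tag{4.46} \]

\[ \int_0^t \int_{\{f_s>0\}} \frac{|\nabla f_s|_*^2}{f_s} \,\mathrm d\mathfrak m \,\mathrm ds \le 2e^{4\mathrm{Lip}^2(V)t} \int_X f_0\big(\log f_0 + V^2\big) \,\mathrm d\mathfrak m. \tag{4.47} \]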

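Proof. We set $L = \mathrm{Lip}(V)$ and

\[ M^2(t) := \int_X V^2 f_t \,\mathrm d\mathfrak m, \qquad E(t) := \int_X f_t \log f_t \,\mathrm d\mathfrak m, \qquad F^2(t) := \int_{\{f_t>0\}} \frac{|\nabla f_t|_*^2}{f_t} \,\mathrm d\mathfrak m. \tag{4.48} \]

Applying (4.29) to $(f_t + \varepsilon)$ and letting $\varepsilon \downarrow 0$ we get $F \in L^2(0, T)$ for every $T > 0$ with

\[ \frac{\mathrm d}{\mathrm dt} E(t) = -F^2(t) \quad \text{a.e. in } (0, T). \tag{4.49} \]

The convexity inequality $r \log r \ge r - r_0 + r \log r_0$ with $r = f_t$, $r_0 = e^{-V^2}$, and the conservation of the total mass (4.30) and (4.45) yield for every $t \ge 0$

\[ E(t) \ge \int_X \big(f_t - e^{-V^2}\big) \,\mathrm d\mathfrak m - M^2(t) = \int_X \big(f_0 - e^{-V^2}\big) \,\mathrm d\mathfrak m - M^2(t) \ge -M^2(t). \]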
We introduce now the truncated weight $V_k(x) = \min(V(x), k)$ and the corresponding functional $M_k^2(t)$ defined as in (4.48). Since the map $t \mapsto M_k^2(t)$ is Lipschitz continuous we get for a.e. $t > 0$

\[ \frac{\mathrm d}{\mathrm dt} M_k^2(t) = \int_X V_k^2\,\Delta_{\mathsf d,\mathfrak m} f_t \,\mathrm d\mathfrak m \le 2 \int_X |\nabla f_t|_*|\nabla V_k|_* V_k \,\mathrm d\mathfrak m \le 2L F(t) M_k(t). \tag{4.50} \]

We deduce that

\[ M_k(t) \le M_k(0) + L \int_0^t F(s) \,\mathrm ds \le M(0) + L \int_0^t F(s) \,\mathrm ds, \]

so that $M_k(t)$ is uniformly bounded. Passing to the limit in (an integral form of) (4.50) as $k \to \infty$ by monotone convergence, we obtain the same differential inequality for $M$:

\[ \frac{\mathrm d}{\mathrm dt} M^2(t) \le 2L F(t) M(t). \]

Combining with (4.49) we obtain

\[ \frac{\mathrm d}{\mathrm dt}\big(E + 2M^2\big) + F^2 \le 4LFM \le F^2 + 4L^2 M^2. \]

Since $E + 2M^2 \ge M^2$, Gronwall lemma yields

\[ M^2(t) \le E(t) + 2M^2(t) \le \big(E(0) + 2M^2(0)\big)e^{4L^2 t}, \]

i.e. (4.46). Integrating now (4.49) we get $\int_0^t F^2(s) \,\mathrm ds \le E(0) - E(t) \le E(0) + M^2(t)$, which yields (4.47). □
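
The two inequalities behind the Gronwall step can be made explicit. A sketch, writing $u(t) := E(t) + 2M^2(t)$ (a symbol introduced here for illustration) and using (4.49) together with the differential inequality for $M^2$:

```latex
\begin{aligned}
\frac{\mathrm d}{\mathrm dt}\big(E+2M^2\big)
  &\le -F^2 + 4LFM, \\
4LFM = 2\,F\,(2LM) &\le F^2 + 4L^2M^2
  && \text{(Young's inequality } 2ab\le a^2+b^2), \\
u' &\le 4L^2M^2 \le 4L^2\,u
  && \text{since } M^2\le E+2M^2=u, \\
\frac{\mathrm d}{\mathrm dt}\big(e^{-4L^2t}u(t)\big) &\le 0
  \;\Longrightarrow\; u(t)\le u(0)\,e^{4L^2t}.
\end{aligned}
```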

We want now to extend the validity of (4.30), (4.46) and (4.47) to the case when $\mathfrak m(X) = \infty$, at least when (4.2) holds. Notice that this assumption also includes the cases when $\mathfrak m(X) \in (0, \infty)$.

Theorem 4.20 If $\mathfrak m$ is a $\sigma$-finite measure satisfying (4.2), then the gradient flow $\mathsf H_t$ of Cheeger's energy is mass preserving (i.e. (4.30) holds) and satisfies the contraction estimate (4.31), in particular (4.27) for every $p \in [1, \infty]$. Moreover, for every nonnegative $f_0 \in L^2(X, \mathfrak m)$ with

\[ f_0 \log f_0 \in L^1(X, \mathfrak m), \qquad \int_X V^2 f_0 \,\mathrm d\mathfrak m < \infty, \qquad \int_X f_0 \,\mathrm d\mathfrak m = 1, \tag{4.51} \]

the solution $f_t = \mathsf H_t(f_0)$ of (4.22) satisfies (4.46) and (4.47) for every $t \ge 0$.

Proof. Let us first prove mass preservation and (4.46), (4.47) for a nonnegative initial datum $f_0$ satisfying (4.51). The proof is based on a simple approximation result. We set $V_k := \min(V, k)$ and $\mathfrak m^k := e^{V_k^2}\,\mathfrak m^0 = e^{V_k^2 - V^2}\,\mathfrak m$, so that $\mathfrak m^k$ is an increasing family of finite measures satisfying conditions (4.35), by monotone convergence. In addition, since by (4.2) $V$ is $\mathsf d$-Lipschitz and bounded from above on compact sets, (4.36) holds.

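We define $f^k_t = \mathsf H^k_t(f_0)$ as in Theorem 4.18 and $z_k := \int_X f_0 \,\mathrm d\mathfrak m^k$. We apply (4.30) to obtain that $\int_X f^k_t \,\mathrm d\mathfrak m^k = z_k$ for all $t \ge 0$; then, since $z_k \uparrow 1$ we can find $\eta_k \downarrow 1$ such that $\int_X e^{-\eta_k^2 V^2} \,\mathrm d\mathfrak m^k \le z_k$ and apply the estimates of Lemma 4.19 with $\mathfrak m := \mathfrak m^k$ and weight $\eta_k V$ to obtain

\[ \int_X V^2 f^k_t \,\mathrm d\mathfrak m^k \le e^{4\eta_k^2\mathrm{Lip}^2(V)t} \int_X f_0\big(\log f_0 + 2\eta_k^2 V^2\big) \,\mathrm d\mathfrak m^k \le e^{4\eta_k^2\mathrm{Lip}^2(V)t} \int_X f_0\big(\log f_0 + 2\eta_k^2 V^2\big) \,\mathrm d\mathfrak m. \tag{4.52} \]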

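Since, thanks to (4.39), $f^k_t \to f_t$ strongly in $L^2(X, \mathfrak m^0)$ as $k \to \infty$, we get up to subsequences $f^k_t \to f_t$ $\mathfrak m$-a.e., so that Fatou's lemma and the monotonicity of $\mathfrak m^k$ yield

\[ \int_X V^2 f_t \,\mathrm d\mathfrak m \le \liminf_{k\to\infty} \int_X V^2 f^k_t \,\mathrm d\mathfrak m^k \]

and (4.46) follows by (4.52).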

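Let us consider now $A_h := \{x \in X : V(x) \le h\}$ and observe that (4.2) and (4.38) yield

\[ \mathfrak m(A_h) \le \int_X e^{h^2 - V^2} \,\mathrm d\mathfrak m \le e^{h^2} < \infty, \qquad \int_{A_h} f_t \,\mathrm d\mathfrak m = \lim_{k\to\infty} \int_{A_h} f^k_t \,\mathrm d\mathfrak m^k. \]

From (4.46) we obtain for every $t > 0$ a constant $C$ satisfying $h^2 \int_{X\setminus A_h} f^k_t \,\mathrm d\mathfrak m^k \le C$ for every $h > 0$, so that

\[ \int_X f_t \,\mathrm d\mathfrak m \ge \int_{A_h} f_t \,\mathrm d\mathfrak m = \lim_{k\to\infty} \int_{A_h} f^k_t \,\mathrm d\mathfrak m^k \ge 1 - \limsup_{k\to\infty} \int_{X\setminus A_h} f^k_t \,\mathrm d\mathfrak m^k \ge 1 - C/h^2. \]

Since $h$ is arbitrary and the integral of $f_t$ does not exceed 1 by (4.28), we showed that $\int_X f_t \,\mathrm d\mathfrak m = 1$. Finally, (4.47) follows now by the lower semicontinuity (4.37) of Cheeger's energy from the corresponding estimate for $f^k_t$, recalling (4.18).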

Let us now consider an initial datum $f_0 \in L^2(X, \mathfrak m)$ with arbitrary sign and vanishing outside some $A_n$, so that $|f_0|$ satisfies (4.51) (up to multiplication by a suitable constant). The comparison principle yields $|\mathsf H^k_t(f_0)| \le \mathsf H^k_t(|f_0|)$, so that for every $t > 0$ there exists a constant $C$ such that $h^2 \int_{X\setminus A_h} |f^k_t| \,\mathrm d\mathfrak m^k \le C$. Since $\int_X f^k_t \,\mathrm d\mathfrak m^k = \int_X f_0 \,\mathrm d\mathfrak m^k$ by (4.30), we thus have

\[
\begin{aligned}
\Big| \int_X (f_t - f_0) \,\mathrm d\mathfrak m \Big|
 &\le \int_{X\setminus A_h} |f^k_t| \,\mathrm d\mathfrak m^k + \int_{X\setminus A_h} |f_t| \,\mathrm d\mathfrak m + \Big| \int_{A_h} f_t \,\mathrm d\mathfrak m - \int_{A_h} f^k_t \,\mathrm d\mathfrak m^k \Big| \\
 &\quad + \int_X |f_0| \,\mathrm d(\mathfrak m - \mathfrak m^k).
\end{aligned}
\]

Passing to the limit in the previous inequality first as $k \to \infty$, taking (4.38) into account, and then as $h \to \infty$ we obtain that the integral of $f_t$ is constant in time. As in the proof of Theorem 4.16(d) we can show that $\mathsf H_t$ satisfies the contraction estimate (4.27) for $p = 1$ and arbitrary couples of initial data vanishing outside $A_n$. Approximating any $f_0 \in L^1(X, \mathfrak m) \cap L^2(X, \mathfrak m)$ by the sequence $\chi_{A_n} f_0$ we can easily extend the contraction property and the mass conservation to arbitrary initial data.

The contraction property (4.31) (and (4.27) for $p \in (1, 2)$) then follows as in the proof of Theorem 4.16(d). □



Remark 4.21 It is interesting to compare the mass preservation property of Theorem 4.20,
relying on (4.2), with the well known results for the heat flow on a smooth, complete, finite
dimensional Riemannian manifold $(X,d,m)$, where $d$ (resp. $m$) is the induced Riemannian
distance (resp. volume measure). In this case, a sufficient condition [18, Theorem 9.1] is

\[ \int_{r_0}^\infty \frac{r}{\log(m(r))}\,dr = \infty \quad \text{for some } r_0 > 0, \qquad m(r) := m(\{x : d(x,x_0) < r\}), \tag{4.53} \]

which is obviously a consequence of (4.3). On the other hand, (4.3) is always satisfied if the
Ricci curvature of $X$ is bounded from below: more generally (4.3) holds in metric spaces
satisfying the $CD(K,\infty)$ condition, see Section 9 and [35, Theorem 4.24].

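Proposition 4.22 (Entropy dissipation) Let $m$ be a $\sigma$-finite measure satisfying (4.2), let
$f_0 \in L^2(X,m)$ be a nonnegative initial datum with $\int_X f_0\,dm = 1$ and $\int_X V^2 f_0\,dm < \infty$ and let
$(f_t)$ be the corresponding gradient flow of Cheeger's energy. Then the map $t \mapsto \int_X f_t \log f_t\,dm$
is locally absolutely continuous in $(0,\infty)$ and it holds

\[ \frac{d}{dt} \int_X f_t \log f_t\,dm = -\int_{\{f_t>0\}} \frac{|\nabla f_t|_*^2}{f_t}\,dm \quad \text{for a.e. } t \in (0,\infty). \tag{4.54} \]

Proof. The case when $m(X) < \infty$ can be easily deduced by Proposition 4.16(c). If $m(X) = \infty$
we first consider regularized $C^{1,1}(0,\infty)$ and convex entropies $e_\varepsilon$, $0 < \varepsilon < e^{-1}$, of $e(r) := r\log r$:

\[ e_\varepsilon(r) := \begin{cases} (1 + \log\varepsilon)\,r = e'(\varepsilon)\,r & \text{in } [0,\varepsilon], \\ r\log r + \varepsilon = e(r) - e(\varepsilon) + \varepsilon e'(\varepsilon) & \text{in } [\varepsilon,\infty). \end{cases} \]

Notice that $e_\varepsilon'(r) = \max\{e'(r), e'(\varepsilon)\} \le (1 + \log r)^+$ because of our choice of $\varepsilon$; since
$(1 + \log r)^+ \le r$, we deduce that $e(r) \le e_\varepsilon(r) \le \frac12 r^2$ and $e_\varepsilon(r) \downarrow e(r)$ as $\varepsilon \downarrow 0$.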
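
As a sanity check on the regularized entropies, the following short computation (a sketch using only $e(r) = r\log r$ and $e'(r) = 1 + \log r$) verifies the identity stated in the second branch and the $C^1$ matching of the two branches at $r = \varepsilon$.

```latex
\begin{align*}
e(r) - e(\varepsilon) + \varepsilon e'(\varepsilon)
  &= r\log r - \varepsilon\log\varepsilon + \varepsilon(1 + \log\varepsilon)
   = r\log r + \varepsilon, \\
% values of the two branches at r = \varepsilon agree:
(1 + \log\varepsilon)\,\varepsilon
  &= \varepsilon\log\varepsilon + \varepsilon, \\
% first derivatives of the two branches at r = \varepsilon agree:
\frac{d}{dr}\big[(1+\log\varepsilon)\,r\big]
  &= 1 + \log\varepsilon
   = \frac{d}{dr}\big[r\log r + \varepsilon\big]\Big|_{r=\varepsilon}.
\end{align*}
```

Since $\varepsilon < e^{-1}$ gives $e'(\varepsilon) = 1 + \log\varepsilon < 0$, the linear branch has negative slope, which is what makes the bound $e_\varepsilon' \le (1+\log r)^+$ hold on $[0,\varepsilon]$.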

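We can now define a convex and $C^{1,1}(\mathbb R)$ function by setting $\tilde e_\varepsilon(r) := e_\varepsilon(r) - (1 + \log\varepsilon)\,r$ for
$r \ge 0$ and $\tilde e_\varepsilon(r) = 0$ for $r < 0$; applying (4.29) (by the previous estimates $\int_X \tilde e_\varepsilon(f_0)\,dm < \infty$)
and recalling that the integral of $f_t$ is constant for every $t \ge 0$ we obtain

\[ \int_X e_\varepsilon(f_t)\,dm + \int_0^t \int_{\{f_s > \varepsilon\}} \frac{|\nabla f_s|_*^2}{f_s}\,dm\,ds = \int_X e_\varepsilon(f_0)\,dm. \]

Passing to the limit as $\varepsilon \downarrow 0$ and recalling the uniform bounds (4.46) and (4.47), guaranteed
by Theorem 4.20, we conclude. □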



Remark 4.23 Although these facts will not play a role in the paper, we emphasize that
it is also possible to define one-sided Cheeger energies $Ch_*^+(f)$, $Ch_*^-(f)$, by relaxing respectively
the ascending and descending slopes of Borel and $d$-Lipschitz functions w.r.t. $L^2(X,m)$
convergence. We still have the representation

\[ Ch_*^+(f) = \frac12 \int_X |\nabla^+ f|_*^2\,dm, \qquad Ch_*^-(f) = \frac12 \int_X |\nabla^- f|_*^2\,dm \]

for suitable one-sided relaxed gradients $|\nabla^\pm f|_*$ with minimal norm and it is easily seen that
the functionals $Ch_*^\pm$ are convex and lower semicontinuous in $L^2(X,m)$.

Obviously $Ch_* \ge \max\{Ch_*^+, Ch_*^-\}$ and $Ch_*^+(f) = Ch_*^-(-f)$. Lemma 4.3 still holds with the
same proof and, using (2.9), locality can be proved for the one-sided relaxed gradients as well,
so that $|\nabla^\pm f|_* \le |\nabla^\pm f|$ $m$-a.e. for $f$ Borel and $d$-Lipschitz. We shall see in the next section
that if $m$ satisfies (4.2) then for every Borel function $|\nabla^\pm f|_* = |\nabla f|_*$ and $Ch_* = Ch_*^+ = Ch_*^-$.
5 Weak upper gradients



In this section we define a new notion for the "weak norm of the gradient" (which we will
call "minimal weak upper gradient") of a real valued function $f$ on an extended metric space
$(X,d)$ and we will show that this new notion essentially coincides with the relaxed gradient.
The approach that we use here is inspired by the work [34], i.e. rather than proceeding by 
relaxation, as we did for |V/|*, we ask the fundamental theorem of calculus to hold along 
"most" absolutely continuous curves, in a sense that we will specify soon. Our definition of 
null set of curves is different from [34], natural in the context of optimal transportation, and 
leads to an a priori larger class of null sets, see Remark 5.3; also, another difference is that 
we consider Sobolev regularity (and not absolute continuity) along every curve, so that our 
theory does not depend on the choice of precise representatives in the Lebesgue equivalence 
class. In Remark 5.10 we compare more closely the two approaches and show, as a nontrivial 
consequence of our identification results, that they lead to the same Sobolev space. 

The advantages of working with a direct definition, rather than proceeding by relaxation,
can be appreciated by looking at Lemma 5.15, where we prove absolute continuity of functionals
$t \mapsto \int_X \phi(f_t)\,dm$ even along curves $t \mapsto f_t m$ that are absolutely continuous in the
Wasserstein sense, compare with Proposition 4.22 for $L^2$-gradient flows; we can also compute
the minimal weak upper gradient for Kantorovich potentials, as we will see in Section 10.

We assume in this section that $(X,\tau,d)$ is an extended Polish space and that $m$ is a $\sigma$-finite
Borel measure in $X$ representable in the form $e^{V^2}\tilde m$ with $\tilde m(X) \le 1$ and $V : X \to [0,\infty)$ Borel
and $d$-Lipschitz. Recall that the $p$-energy of an absolutely continuous curve has been defined
in (2.2), as well as the collection of curves of finite $p$-energy $AC^p((0,1);(X,d))$, which we will
consider as a Borel subset of $C([0,1];X)$ (and in particular a Borel subset of a Polish space).

5.1 Test plans, Sobolev functions along a.c. curves, and weak upper gradients

Recall that the evaluation maps $e_t : C([0,1];X) \to X$ are defined by $e_t(\gamma) := \gamma_t$. We also
introduce the restriction maps $\mathrm{restr}_t^s : C([0,1];X) \to C([0,1];X)$, $0 \le t \le s \le 1$, given by

\[ \mathrm{restr}_t^s(\gamma)_r := \gamma_{(1-r)t + rs}, \tag{5.1} \]

so that $\mathrm{restr}_t^s$ restricts the curve $\gamma$ to the interval $[t,s]$ and then "stretches" it on the whole
of $[0,1]$.
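
A short worked computation may help to see what the restriction maps do to endpoints and to curvilinear integrals. It assumes (our reading of the verbal description above, not a formula quoted from elsewhere in the text) that the restriction map is the affine reparameterization $\mathrm{restr}_t^s(\gamma)_r = \gamma_{(1-r)t+rs}$, and $G$ denotes an arbitrary nonnegative Borel function, introduced here only for illustration.

```latex
% Assumption: restr_t^s(\gamma)_r = \gamma_{(1-r)t + rs}, with 0 \le t \le s \le 1.
\begin{align*}
e_0\circ\mathrm{restr}_t^s(\gamma) &= \gamma_t, \qquad
e_1\circ\mathrm{restr}_t^s(\gamma) = \gamma_s, \\
\Big|\tfrac{d}{dr}\,\mathrm{restr}_t^s(\gamma)_r\Big|
  &= (s-t)\,|\dot\gamma_{(1-r)t+rs}| \quad \text{for a.e. } r\in(0,1), \\
\int_0^1 G\big(\mathrm{restr}_t^s(\gamma)_r\big)\,
   \Big|\tfrac{d}{dr}\,\mathrm{restr}_t^s(\gamma)_r\Big|\,dr
  &= \int_t^s G(\gamma_u)\,|\dot\gamma_u|\,du
  \qquad (u = (1-r)t + rs).
\end{align*}
```

In particular, an inequality between endpoint values and a curvilinear integral, once written for the stretched curve, becomes the same inequality on the subinterval $[t,s]$ of the original curve.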

Definition 5.1 (Test plans) We say that a probability measure $\pi \in \mathscr P(C([0,1];X))$ is a
test plan if it is concentrated on $AC((0,1);(X,d))$, i.e. $\pi(C([0,1];X) \setminus AC((0,1);(X,d))) = 0$,
and

\[ (e_t)_\sharp \pi \ll m \quad \text{for all } t \in [0,1]. \tag{5.2} \]

A collection $\mathcal T$ of test plans is stretchable if

\[ \pi \in \mathcal T \implies (\mathrm{restr}_t^s)_\sharp \pi \in \mathcal T \quad \text{for every } 0 \le t \le s \le 1. \tag{5.3} \]

We will often impose additional quantitative assumptions on test plans, besides (5.2). The
most important one, which we call bounded compression, provides a locally uniform upper
bound on the densities of $(e_t)_\sharp \pi$. More precisely, a test plan $\pi$ has bounded compression on
the sublevels of $V$ if for all $M \ge 0$ there exists $C = C(\pi,M) \in [0,\infty)$ satisfying

\[ (e_t)_\sharp \pi(B \cap \{V \le M\}) \le C(\pi,M)\,m(B) \quad \forall B \in \mathscr B(X),\ t \in [0,1]. \tag{5.4} \]

The above condition (5.4) depends not only on $m$, but also on $V$. For finite measures $m$ it
will be understood that we take $V$ equal to a constant, so that (5.4) does not depend on the
value of the constant.

Taking (5.4) into account, typical examples of stretchable collections $\mathcal T$ are the families of
all the test plans with bounded compression which are concentrated on absolutely continuous
curves, or on the curves of finite 2-energy, or on the geodesics in $X$.

Definition 5.2 (Negligible sets of curves) Let $\mathcal T$ be a stretchable collection of test plans.
We say that a set $N \subset AC((0,1);(X,d))$ is $\mathcal T$-negligible provided $\pi(N) = 0$ for any test plan
$\pi \in \mathcal T$. A property which holds for every curve of some Borel set $A \subset AC((0,1);(X,d))$ except
possibly for a negligible subset of $A$ is said to hold for $\mathcal T$-almost all curves in $A$.

When $\mathcal T$ is a stretchable collection of test plans, we say that $f : X \to \mathbb R$ is Sobolev along
$\mathcal T$-almost all curves if, for $\mathcal T$-almost all absolutely continuous curves $\gamma$, $f \circ \gamma$ coincides a.e. in
$[0,1]$ and in $\{0,1\}$ with an absolutely continuous map $f_\gamma : [0,1] \to \mathbb R$.

In the next remark we compare our definition with the more classical notion of $\mathrm{Mod}_2$-null
set of absolutely continuous curves used in [34].

Remark 5.3 Recall that, for a collection $\Gamma$ of absolutely continuous curves in $(X,d)$, the
2-modulus $\mathrm{Mod}_2(\Gamma)$ is defined by

\[ \mathrm{Mod}_2(\Gamma) := \inf\Big\{ \int_X g^2\,dm : g \ge 0 \text{ Borel},\ \int_\gamma g \ge 1 \text{ for all } \gamma \in \Gamma \Big\}. \]

If $\mathcal T$ denotes the class of plans with bounded compression defined by (5.4), it is not difficult
to show that Borel and $\mathrm{Mod}_2$-null sets of curves are $\mathcal T$-negligible. Indeed, if $\pi \in \mathcal T$ has (with
no loss of generality) finite 2-action and is concentrated on curves contained in $\{V \le M\}$ and
$\int_\gamma g \ge 1$ for all $\gamma \in \Gamma$, we can integrate w.r.t. $\pi$ and then minimize w.r.t. $g$ to get

\[ [\pi(\Gamma)]^2 \le C(\pi,M)\,\mathrm{Mod}_2(\Gamma) \int\!\!\int_0^1 |\dot\gamma_s|^2\,ds\,d\pi(\gamma). \]

Proving equivalence of the two concepts seems to be difficult, also because one notion is
independent of parameterization, while the other one (because of the bounded compression
condition) takes into account also the way curves are parameterized. ■
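
The inequality at the end of Remark 5.3 follows from a Hölder argument. The sketch below uses the notation $\int_\gamma g = \int_0^1 g(\gamma_s)|\dot\gamma_s|\,ds$ for the curvilinear integral (we assume this is the convention fixed earlier in the paper) and takes any $g$ admissible in the definition of the 2-modulus of $\Gamma$.

```latex
% g \ge 0 Borel with \int_\gamma g \ge 1 for every \gamma \in \Gamma;
% \pi has bounded compression and is concentrated on curves inside \{V \le M\}.
\begin{align*}
\pi(\Gamma)
 &\le \int\!\!\int_0^1 g(\gamma_s)\,|\dot\gamma_s|\,ds\,d\pi(\gamma) \\
 &\le \Big(\int\!\!\int_0^1 g^2(\gamma_s)\,ds\,d\pi(\gamma)\Big)^{1/2}
      \Big(\int\!\!\int_0^1 |\dot\gamma_s|^2\,ds\,d\pi(\gamma)\Big)^{1/2}
      && \text{(Cauchy--Schwarz)} \\
 &\le \Big(C(\pi,M)\int_X g^2\,dm\Big)^{1/2}
      \Big(\int\!\!\int_0^1 |\dot\gamma_s|^2\,ds\,d\pi(\gamma)\Big)^{1/2}
      && \text{(bounded compression (5.4))}.
\end{align*}
% Squaring and taking the infimum over admissible g gives the stated bound.
```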

Definition 5.4 (Weak upper gradients) Let $\mathcal T$ be a stretchable collection of test plans.
Given $f : X \to \mathbb R$ Sobolev along $\mathcal T$-almost all curves, an $m$-measurable function $G : X \to [0,\infty]$
is a $\mathcal T$-weak upper gradient of $f$ (or a weak upper gradient w.r.t. $\mathcal T$) if

\[ \Big|\int_{\partial\gamma} f\Big| \le \int_\gamma G \quad \text{for } \mathcal T\text{-almost all } \gamma \in AC((0,1);(X,d)). \tag{5.5} \]

Notice that the measurability of $s \mapsto G(\gamma_s)$ in $[0,1]$ for $\mathcal T$-almost every $\gamma$ is a direct consequence
of the $m$-measurability of $G$: indeed, if $\tilde G$ is a Borel modification of $G$, $A \supset \{G \ne \tilde G\}$ is an
$m$-negligible Borel set and $\pi$ is a test plan we have by (5.2) that $\pi(\{\gamma_t \in A\}) = (e_t)_\sharp\pi(A) = 0$
for every $t \in [0,1]$, so that

\[ 0 = \int_0^1 \pi(\{\gamma_t \in A\})\,dt = \int_0^1\!\!\int \chi_{\{\gamma_t \in A\}}\,d\pi(\gamma)\,dt = \int \Big( \int_0^1 \chi_{\{\gamma_t \in A\}}\,dt \Big) d\pi(\gamma). \]

So $\int_0^1 \chi_{\{\gamma_t \in A\}}\,dt$ is null for $\pi$-a.e. $\gamma$. For any curve $\gamma$ for which the integral is null, $G(\gamma_t)$
coincides a.e. in $[0,1]$ with the Borel map $\tilde G(\gamma_t)$.

Remark 5.5 (Slopes of d-Lipschitz functions are weak upper gradients) As we explained
in Remark 2.6, if $f : X \to \mathbb R$ is Borel and $d$-Lipschitz, then the local Lipschitz
constant $|\nabla f|$ and the one-sided slopes are upper gradients. Therefore they are also weak
upper gradients w.r.t. any stretchable collection of test plans, in the sense we just defined. Notice
that the $m$-measurability of the slopes is ensured by Lemma 2.4. ■

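Remark 5.6 (Restriction and equivalent formulation) If $G$ is a $\mathcal T$-weak upper gradient
of $f$, then the stretchable condition (5.3) yields for every $t < s$ in $[0,1]$

\[ |f(\gamma_s) - f(\gamma_t)| \le \int_t^s G(\gamma_r)|\dot\gamma_r|\,dr \quad \text{for } \mathcal T\text{-almost all } \gamma. \]

It follows that for $\mathcal T$-almost all $\gamma$ the function $f_\gamma$ satisfies

\[ |f_\gamma(s) - f_\gamma(t)| \le \int_t^s G(\gamma_r)|\dot\gamma_r|\,dr \quad \text{for all } t < s \in \mathbb Q \cap [0,1]. \]

Since $f_\gamma$ is continuous the same holds for all $t < s$ in $[0,1]$, so that we obtain an equivalent
pointwise formulation of (5.5):

\[ \Big|\frac{d}{dt} f_\gamma\Big| \le G \circ \gamma\,|\dot\gamma| \quad \text{a.e. in } [0,1], \text{ for } \mathcal T\text{-almost all } \gamma \in AC((0,1);(X,d)). \tag{5.6} \]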



5.2 Calculus with weak upper gradients

Proposition 5.7 (Locality) Let $\mathcal T$ be a stretchable collection of test plans, let $f : X \to \mathbb R$
be Sobolev along $\mathcal T$-almost all absolutely continuous curves, and let $G_1$, $G_2$ be weak upper
gradients of $f$ w.r.t. $\mathcal T$. Then $\min\{G_1, G_2\}$ is a $\mathcal T$-weak upper gradient of $f$.

Proof. It is a direct consequence of (5.6). □ 

The notion of weak upper gradient enjoys natural invariance properties with respect to m- 
negligible sets: 

Proposition 5.8 (Invariance under modifications in m-negligible sets) Let $\mathcal T$ be a
stretchable collection of test plans, let $f, \tilde f : X \to \mathbb R$ and $G, \tilde G : X \to [0,\infty]$ be such that both
$\{f \ne \tilde f\}$ and $\{G \ne \tilde G\}$ are $m$-negligible. Assume that $f$ is Sobolev along $\mathcal T$-almost all curves
and that $G$ is a $\mathcal T$-weak upper gradient of $f$. Then $\tilde f$ is Sobolev along $\mathcal T$-almost all curves and
$\tilde G$ is a $\mathcal T$-weak upper gradient of $\tilde f$.

Proof. Fix a test plan $\pi$: it is sufficient to prove that the sets $\{\gamma : f(\gamma_t) \ne \tilde f(\gamma_t)\}$, $t = 0, 1$,
the set $\{\gamma : \int_\gamma G \ne \int_\gamma \tilde G\}$ and $\{\gamma : \mathscr L^1(\{s : f(\gamma_s) \ne \tilde f(\gamma_s)\}) > 0\}$ are contained in
$\pi$-negligible Borel sets.

For the first two sets the proof is obvious, because $(e_t)_\sharp\pi \ll m$, which implies that if $A$
is an $m$-negligible Borel set containing $\{f \ne \tilde f\}$ we have $\pi(\{\gamma : \gamma_t \in A\}) = (e_t)_\sharp\pi(A) = 0$.
For the third one we choose as $A$ an $m$-negligible Borel set containing $\{G \ne \tilde G\}$ and we use
the argument described immediately after Definition 5.4. For the fourth one we choose an
$m$-negligible Borel set $A$ containing $\{f \ne \tilde f\}$ and argue as for the third. □

Thanks to the previous proposition we can also consider extended real valued $f$ (as Kantorovich
potentials), provided the set $N = \{|f| = \infty\}$ is $m$-negligible: as a matter of fact the
curves $\gamma$ which intersect $N$ at $t = 0$ or $t = 1$ are negligible, hence $\int_{\partial\gamma} f$ is defined for almost
every $\gamma$.

Definition 5.9 (Minimal weak upper gradient) Let $\mathcal T$ be a stretchable collection of test
plans and let $f : X \to \mathbb R$ be Sobolev along $\mathcal T$-almost all absolutely continuous curves. The
$\mathcal T$-minimal weak upper gradient $|\nabla f|_{w,\mathcal T}$ of $f$ is the weak upper gradient characterized, up to
$m$-negligible sets, by the property

\[ |\nabla f|_{w,\mathcal T} \le G \quad m\text{-a.e. in } X, \text{ for every } \mathcal T\text{-weak upper gradient } G \text{ of } f. \tag{5.7} \]

Uniqueness of the minimal weak upper gradient is obvious. For existence, we take $|\nabla f|_{w,\mathcal T} :=
\inf_n G_n$, where $G_n$ are weak upper gradients which provide a minimizing sequence in

\[ \inf\Big\{ \int_X \tan^{-1} G\,d\tilde m : G \text{ is a } \mathcal T\text{-weak upper gradient of } f \Big\}. \]

We immediately see, thanks to Proposition 5.7, that we can assume with no loss of generality
that $G_{n+1} \le G_n$. Hence, by monotone convergence, the function $|\nabla f|_{w,\mathcal T}$ is a weak upper
gradient of $f$ and $\int_X \tan^{-1} G\,d\tilde m$ is minimal at $G = |\nabla f|_{w,\mathcal T}$. This minimality, in
conjunction with Proposition 5.7, gives (5.7).

Remark 5.10 (Comparison with Newtonian spaces) Shanmugalingam introduced in [34]
the Newtonian space $N^{1,2}(X,d,m)$ of all functions $f : X \to \mathbb R$ such that $\int f^2\,dm < \infty$ and
the inequality

\[ |f(\gamma_1) - f(\gamma_0)| \le \int_\gamma G \tag{5.8} \]

holds out of a $\mathrm{Mod}_2$-null set of curves, for some $G \in L^2(X,m)$. Then, she defined $|\nabla f|_S$ as
the function $G$ in (5.8) with smallest norm and proved [34, Proposition 3.1] that functions
in $N^{1,2}(X,d,m)$ are absolutely continuous along $\mathrm{Mod}_2$-almost every curve.

Remarkably, Shanmugalingam proved (the proofs in [34] work, with no change, even
in the case of extended metric measure spaces) this connection between Newtonian spaces
and Cheeger's functional $Ch_C$ described in Remark 4.6: $f \in D(Ch_C)$ if and only if there is
$\tilde f \in N^{1,2}(X,d,m)$ in the Lebesgue equivalence class of $f$, and the two notions of gradient
$|\nabla \tilde f|_S$ and $|\nabla f|_C$ coincide $m$-a.e. in $X$.

Now, the inclusion between null sets provided by Remark 5.3 shows that the situation
described in Remark 4.6 is reversed. Indeed, while $|\nabla f|_C \le |\nabla f|_*$, the gradient $|\nabla f|_S$ is
larger $m$-a.e. than $|\nabla f|_{w,\mathcal T}$, so that

\[ |\nabla f|_{w,\mathcal T} \le |\nabla f|_S = |\nabla f|_C \le |\nabla f|_* \quad m\text{-a.e. in } X. \]

Although we are not presently able to reverse the inclusion between null sets, a nontrivial
consequence of our identification of $|\nabla f|_{w,\mathcal T}$ and $|\nabla f|_*$, proved in the next section, is that all
these gradients coincide $m$-a.e. in $X$.

Since $D(Ch_*) \subset D(Ch_C)$, a byproduct of the absolute continuity of functions in Newtonian
spaces, that however will not play a role in our paper, is that functions in $D(Ch_*)$ have a
version which is absolutely continuous along $\mathrm{Mod}_2$-a.e. curve. ■

Remark 5.11 Notice that the notion of weak gradient does depend on the class $\mathcal T$ of test plans
(which, in turn, might depend on $V$).

If $\mathcal T_1 \subset \mathcal T_2$ are stretchable collections of test plans and a function $f : X \to \mathbb R$ is Sobolev
along $\mathcal T_2$-almost all absolutely continuous curves, then $f$ is Sobolev along $\mathcal T_1$-almost all absolutely
continuous curves and

\[ |\nabla f|_{w,\mathcal T_1} \le |\nabla f|_{w,\mathcal T_2}. \tag{5.9} \]

Thus larger classes of test plans induce smaller classes of weak upper gradients, hence larger
minimal weak upper gradients. ■
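
The monotonicity in Remark 5.11 can be unpacked into three implications. The chain below is a sketch that uses only the definition of negligible sets, the definition of weak upper gradient, and the minimality property (5.7); the symbols $\mathcal T_1 \subset \mathcal T_2$ denote the two classes of test plans of the remark.

```latex
\begin{align*}
\mathcal T_1\subset\mathcal T_2
 &\implies \big( N \text{ is } \mathcal T_2\text{-negligible}
      \ \Rightarrow\ N \text{ is } \mathcal T_1\text{-negligible} \big) \\
 &\implies \big( G \text{ is a } \mathcal T_2\text{-weak upper gradient of } f
      \ \Rightarrow\ G \text{ is a } \mathcal T_1\text{-weak upper gradient of } f \big) \\
 &\implies |\nabla f|_{w,\mathcal T_1} \le |\nabla f|_{w,\mathcal T_2}
      \quad m\text{-a.e. in } X .
\end{align*}
% Last step: apply (5.7) for the class \mathcal T_1 with the competitor G = |\nabla f|_{w,\mathcal T_2}.
```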

Another important property of weak upper gradients is their stability w.r.t. $L^p$ convergence:
we state it for all the stretchable classes of test plans satisfying a condition weaker
than bounded compression, inspired by the "democratic" condition introduced in [27].

Theorem 5.12 (Stability w.r.t. m-a.e. convergence) Let us suppose that $\mathcal T$ is a stretchable
collection of test plans concentrated on $AC^p((0,1);(X,d))$ for some $p \in (1,\infty]$ such that
for all $\pi \in \mathcal T$ and all $M \ge 0$ there exists $C = C(\pi,M) \in [0,\infty)$ satisfying

\[ \int_0^1 (e_t)_\sharp\pi(B \cap \{V \le M\})\,dt \le C(\pi,M)\,m(B) \quad \forall B \in \mathscr B(X). \tag{5.10} \]

Assume that $f_n$ are $m$-measurable, Sobolev along $\mathcal T$-almost all curves and that $G_n$ are $\mathcal T$-weak
upper gradients of $f_n$. Assume furthermore that $f_n(x) \to f(x) \in \mathbb R$ for $m$-a.e. $x \in X$ and
that $(G_n)$ weakly converges to $G$ in $L^q(\{V \le M\},m)$ for all $M \ge 0$, where $q \in [1,\infty)$ is the
conjugate exponent of $p$. Then $G$ is a $\mathcal T$-weak upper gradient of $f$.

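Proof. Fix a test plan $\pi$ and assume with no loss of generality that $\mathcal E_p[\gamma] \le L < \infty$ $\pi$-a.e.
(recall (2.2)). By Mazur's theorem we can find convex combinations

\[ H_n := \sum_{i=N_n+1}^{N_{n+1}} \alpha_i G_i \quad \text{with } \alpha_i \ge 0,\ \sum_{i=N_n+1}^{N_{n+1}} \alpha_i = 1,\ N_n \to \infty \]

converging strongly to $G$ in $L^q(\{V \le M\},m)$. Denoting by $\tilde f_n$ the corresponding convex
combinations of $f_n$, $H_n$ are weak upper gradients of $\tilde f_n$ and still $\tilde f_n \to f$ $m$-a.e. in $\{V \le M\}$.
Since for every nonnegative Borel function $\varphi : X \to [0,\infty]$ and any integer $M$ it holds
(with $C = C(\pi,M)$)

\begin{align*}
\int \Big( \int_{\gamma \cap \{V \le M\}} \varphi \Big)\,d\pi
 &= \int \Big( \int_0^1 \chi_{\{V \le M\}}(\gamma_t)\,\varphi(\gamma_t)\,|\dot\gamma_t|\,dt \Big)\,d\pi \\
 &\le \int \Big( \int_0^1 \chi_{\{V \le M\}}(\gamma_t)\,\varphi^q(\gamma_t)\,dt \Big)^{1/q} \Big( \int_0^1 |\dot\gamma_t|^p\,dt \Big)^{1/p} d\pi \\
 &\le \Big( \int_0^1\!\!\int_{\{V \le M\}} \varphi^q\,d(e_t)_\sharp\pi\,dt \Big)^{1/q} \Big( \int \mathcal E_p[\gamma]\,d\pi \Big)^{1/p} \\
 &\le \Big( C \int_{\{V \le M\}} \varphi^q\,dm \Big)^{1/q} \Big( \int \mathcal E_p[\gamma]\,d\pi \Big)^{1/p} \tag{5.11}
\end{align*}

we obtain, for $\bar C := C^{1/q} L^{1/p}$,

\[ \int \Big( \int_{\gamma \cap \{V \le M\}} |H_n - G| + \min\{|\tilde f_n - f|, 1\} \Big)\,d\pi \le \bar C \Big( \|H_n - G\|_{L^q(\{V \le M\},m)} + \|\min\{|\tilde f_n - f|, 1\}\|_{L^q(\{V \le M\},m)} \Big) \to 0. \]

By a diagonal argument we can find a subsequence $n(k)$ independent of $M \in \mathbb N$ such that
$\int_\gamma |H_{n(k)} - G| + \min\{|\tilde f_{n(k)} - f|, 1\} \to 0$ as $k \to \infty$ for $\pi$-a.e. $\gamma$ contained in $\{V \le M\}$, and thus
for $\pi$-a.e. $\gamma$. Since $\tilde f_n$ converge $m$-a.e. to $f$ and the marginals of $\pi$ are absolutely continuous
w.r.t. $m$ we have also that for $\pi$-a.e. $\gamma$ it holds $\tilde f_n(\gamma_0) \to f(\gamma_0)$ and $\tilde f_n(\gamma_1) \to f(\gamma_1)$.

If we fix a curve $\gamma$ satisfying these convergence properties, since $(\tilde f_{n(k)})_\gamma$ are equi-absolutely
continuous (being their derivatives bounded by $H_{n(k)} \circ \gamma\,|\dot\gamma|$) and a further subsequence of $\tilde f_{n(k)}$
converges a.e. in $[0,1]$ and in $\{0,1\}$ to $f(\gamma_s)$, we can pass to the limit to obtain an absolutely
continuous function $f_\gamma$ equal to $f(\gamma_s)$ a.e. in $[0,1]$ and in $\{0,1\}$ with derivative bounded by
$G(\gamma_s)|\dot\gamma_s|$. Since $\pi$ is arbitrary in $\mathcal T$ we conclude that $f$ is Sobolev along $\mathcal T$-almost all curves
and that $G$ is a $\mathcal T$-weak upper gradient of $f$. □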



Corollary 5.13 Let $\mathcal T$ be a stretchable collection of test plans satisfying (5.10) and concentrated
on $AC^2((0,1);(X,d))$, and let $Ch_*$ be defined as in (4.15). If $f \in D(Ch_*)$ then $f$ is
Sobolev along $\mathcal T$-almost all curves and $|\nabla f|_{w,\mathcal T} \le |\nabla f|_*$ $m$-a.e. in $X$.

Proof. By the very definition of $Ch_*$ and the chain rule for relaxed gradients it is sufficient to
consider the case when $f$ is bounded. We already observed in Remark 5.5 that, for a Borel
$d$-Lipschitz function $f$, the local Lipschitz constant is a $\mathcal T$-weak upper gradient. Now, pick
a sequence $(f_n)$ of Borel $d$-Lipschitz functions converging to $f$ in $L^2(X,m)$ such that $|\nabla f_n|$
converge weakly in $L^2(X,m)$ to $|\nabla f|_*$, thus in particular weakly in $L^2(\{V \le M\},m)$ to $|\nabla f|_*$
for all $M \ge 0$. Then, Theorem 5.12 ensures that $|\nabla f|_*$ is a $\mathcal T$-weak upper gradient for $f$. □

We shall also need chain rules for minimal weak upper gradients; the proofs are very 
analogous to those of relaxed gradients, so we omit a few details. 

Proposition 5.14 (Chain rule for minimal weak upper gradients) Let $\mathcal T$ be as in Theorem 5.12.
If $f : X \to \mathbb R$ is Sobolev along $\mathcal T$-almost all curves, the following properties hold:

(a) for any $\mathscr L^1$-negligible Borel set $N \subset \mathbb R$ it holds $|\nabla f|_{w,\mathcal T} = 0$ $m$-a.e. on $f^{-1}(N)$;

(b) $|\nabla \phi(f)|_{w,\mathcal T} = \phi'(f)\,|\nabla f|_{w,\mathcal T}$, with the convention $0 \cdot \infty = 0$, for any nondecreasing function
$\phi$, locally Lipschitz on an interval containing the image of $f$.

Proof. First we prove (b) in the case when $\phi$ is everywhere differentiable. By the same
minimality argument in Proposition 4.8(d), it suffices to show that $|\nabla \phi(f)|_{w,\mathcal T} \le \phi'(f)\,|\nabla f|_{w,\mathcal T}$.
This inequality is a direct consequence of (5.6): indeed, if $f_\gamma$ is the absolutely continuous
function equal to $f \circ \gamma$ a.e. in $[0,1]$ and in $\{0,1\}$, we have $\phi \circ f_\gamma = \phi \circ f \circ \gamma$ on $\{0,1\}$ and
$|(\phi \circ f_\gamma)'| = \phi'(f_\gamma)\,|f_\gamma'| \le \phi'(f_\gamma)\,|\nabla f|_{w,\mathcal T} \circ \gamma\,|\dot\gamma|$. Since $f_\gamma = f \circ \gamma$ a.e. in $[0,1]$, by integration we
get that $\phi'(f)\,|\nabla f|_{w,\mathcal T}$ is a weak upper gradient.

Having established the chain rule when $\phi$ is differentiable, the proof of (a) follows by
the stability of weak gradients, as in the proof of Proposition 4.8(a). Eventually we extend
(b), which now makes sense defining arbitrarily $\phi'(f)$ at points $x$ such that $\phi$ is not
differentiable at $f(x)$, by a further approximation, as in Proposition 4.8. □



Lemma 5.15 Let $\mathcal T$ be the collection of all the test plans concentrated on $AC^2((0,1);(X,d))$
with bounded compression on the sublevels of $V$ (i.e. satisfying (5.4)).

Let $\mu \in AC^2((0,T);(\mathscr P(X),W_2))$ be an absolutely continuous curve with uniformly bounded
densities $f_t = d\mu_t/dm$. Let $\phi : [0,\infty) \to \mathbb R$ be a convex function with $\phi(0) = 0$ and $\phi'$ locally
Lipschitz in $(0,\infty)$. We suppose that for a.e. $t \in (0,T)$ $f_t$ is Sobolev along $\mathcal T$-almost all curves and

\[ H_t^2 := \int_X |\nabla f_t|_{w,\mathcal T}^2\,dm < \infty, \qquad G_t^2 := \int_X \big( \phi''(f_t)\,|\nabla f_t|_{w,\mathcal T} \big)^2 f_t\,dm < \infty, \tag{5.12} \]

for a.e. $t \in (0,T)$. Assume in addition that $G, H \in L^2(0,T)$ and that $\int_X |\phi(f_0)|\,dm < \infty$.
Then $t \mapsto \int_X |\phi(f_t)|\,dm$ is bounded in $[0,T]$,

\[ \Phi_t := \int_X \phi(f_t)\,dm \text{ is absolutely continuous in } [0,T] \text{ and } \Big|\frac{d}{dt}\Phi_t\Big| \le G_t\,|\dot\mu_t| \text{ a.e. in } (0,T). \tag{5.13} \]

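If moreover $\phi'$ is Lipschitz on an interval containing the image of $f_t$, $t \in [0,T]$, then the
pointwise estimates hold

\[ \limsup_{s \downarrow t} \frac{\Phi_t - \Phi_s}{s - t} \le G_t \limsup_{s \downarrow t} \frac{1}{s-t}\int_t^s |\dot\mu_r|\,dr, \qquad \liminf_{s \downarrow t} \frac{\Phi_t - \Phi_s}{s - t} \le G_t \liminf_{s \downarrow t} \frac{1}{s-t}\int_t^s |\dot\mu_r|\,dr. \tag{5.14} \]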

Proof. It is not restrictive to assume $T = 1$. Let $C$ be a constant satisfying $\mu_t \le C m$
for all $t \in [0,1]$ and notice that, by interpolation, $f_t$ are uniformly bounded in all spaces
$L^p(X,m)$. In addition, since $f_s$ weakly converge to $f_t$ as $s \to t$ in the duality with $C_b(X)$,
and $C_b(X) \cap L^p(X,m)$, $1 \le p < \infty$, is dense in $L^p(X,m)$ (thanks to the existence of the
$d$-Lipschitz weight function $V$ whose sublevels have finite $m$ measure), we obtain that $t \mapsto f_t$
is continuous in the weak topology of $L^q(X,m)$ (weak* if $q = \infty$), with $q$ dual exponent of $p$.
It follows that $\Phi_t$ is lower semicontinuous. Arguing as in [23, 2], see also the work in progress
[24] for the case of extended metric spaces, we can find $\pi \in \mathscr P(C([0,1];X))$ concentrated in
$AC^2((0,1);(X,d))$ and satisfying

\[ \mu_t = (e_t)_\sharp\pi \text{ for every } t \in [0,1], \qquad |\dot\mu_t|^2 = \int |\dot\gamma_t|^2\,d\pi(\gamma) \text{ for a.e. } t \in (0,1), \tag{5.15} \]

so that $\pi \in \mathcal T$. Let us first suppose that $\phi'$ is locally Lipschitz continuous in $[0,\infty)$, so that
$\Phi_t$ is everywhere finite. Possibly replacing $\phi(z)$ by $\phi(z) - \phi'(0)z$ we can assume that $\phi$ is
nonnegative and nondecreasing.

We pick a point $t$ such that $f_t$ is Sobolev along $\pi$-almost all curves and $H_t < \infty$ and we
set $h_t := \phi'(f_t)$, $g_t := |\nabla h_t|_{w,\mathcal T} = \phi''(f_t)\,|\nabla f_t|_{w,\mathcal T}$. Then for every $s \in (0,t)$ we have

\[ \Phi_t - \Phi_s \le \int_X \phi'(f_t)(f_t - f_s)\,dm = \int \big( h_t(\gamma_t) - h_t(\gamma_s) \big)\,d\pi(\gamma) \le \int\!\!\int_s^t g_t(\gamma_r)\,|\dot\gamma_r|\,dr\,d\pi(\gamma). \tag{5.16} \]

Since G iv^(0, 1) we deduce from Lemma 2.7 (with tt; = — L = = C^/'^H at all 

points $t$ such that $f_t$ is Sobolev along $\pi$-almost all curves, $g = +\infty$ elsewhere) that $\Phi$ is
absolutely continuous.

Writing the inequalities analogous to (5.16) for $s > t$, dividing by $s - t$, and passing to
the limit as $s \downarrow t$ thanks to the $w^*$ continuity of $r \mapsto f_r$ in $L^\infty(X,m)$ we get the bound (5.14)
(and thus (5.13) when $t$ is also a differentiability point for $\Phi$ and a Lebesgue point for $|\dot\mu|$).

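When $\phi$ is an arbitrary convex function, for $\varepsilon \in (0,1]$ we set

\[ \phi_\varepsilon(r) := \begin{cases} r\,\phi'(\varepsilon) & \text{if } 0 \le r \le \varepsilon, \\ \phi(r) - \phi(\varepsilon) + \varepsilon\phi'(\varepsilon) & \text{if } r \ge \varepsilon; \end{cases} \]

it is easy to check that $\phi_\varepsilon$ is convex, with locally Lipschitz derivative in $[0,\infty)$ and that $\phi_\varepsilon \downarrow \phi$
as $\varepsilon \downarrow 0$, since $\varepsilon \mapsto \varepsilon\phi'(\varepsilon) - \phi(\varepsilon)$ is increasing and converges to 0 as $\varepsilon \downarrow 0$. Notice moreover
that $(\phi_\varepsilon)'' \le \phi''$. Applying the integral form of (5.13) to $\Phi^\varepsilon_t := \int_X \phi_\varepsilon(f_t)\,dm$ we get

\[ |\Phi^\varepsilon_t - \Phi^\varepsilon_s| \le \int_s^t G_r\,|\dot\mu_r|\,dr \quad \text{for every } 0 \le s < t \le 1. \tag{5.17} \]

Since $\Phi^\varepsilon_0 \to \Phi_0$, it follows that all the functions $\Phi^\varepsilon_t$ are uniformly bounded. In addition, (5.16)
with $\phi = \phi_\varepsilon$ gives that

\[ \int_X (\phi_\varepsilon)^-(f_s)\,dm \le \int_X (\phi_\varepsilon)^+(f_s)\,dm + R \le \int_X (\phi_1)^+(f_s)\,dm + R \]

with $R$ uniformly bounded in $s$ and $\varepsilon$ (notice that $t$ can be chosen independently of $s$ and
$\varepsilon$). Hence, applying the monotone convergence theorem we obtain the uniform bound on
$\|\phi(f_t)\|_{L^1(X,m)}$ and pass to the limit in (5.17) as $\varepsilon \downarrow 0$, obtaining (5.13). □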



Remark 5.16 (Invariance properties) If $\mathcal T$ is the collection of all test plans concentrated
on $AC^2((0,1);(X,d))$ with bounded compression on the sublevels of $V$ (according to (5.4)), all
the concepts introduced so far (test plans, negligible sets of curves, weak upper gradient and
minimal weak upper gradient) are immediately seen to be invariant if one replaces $m$ with the
finite measure $\tilde m := e^{-V^2} m$ (recall (4.2)): indeed, any test plan with bounded compression
relative to $\tilde m$ is a test plan with bounded compression relative to $m$, and any test plan with bounded
compression relative to $m$ can be monotonically approximated by analogous test plans relative
to $\tilde m$. A similar argument holds for plans satisfying (5.10). ■
Remark 5.17 As for Cheeger's energy and the relaxed gradient, if no additional assumption
on $(X,\tau,d,m)$ is made, it is well possible that the weak upper gradient is trivial.

This is the case of the second example considered in Remark 4.12, where it is easy to check
that the class of absolutely continuous curves contains just the constants, so that $|\nabla f|_{w,\mathcal T} = 0$
for every $f \in L^2([0,1];m)$ independently of the choice of $\mathcal T$. In order to exclude such
situations, we are going to make additional assumptions on $(X,\tau,d,m)$ in the next sections,
such as the lower semicontinuity of $|\nabla^-\mathrm{Ent}_m|^2(fm)$: this ensures, as we will see in Theorem 7.6,
its agreement with $8\,Ch_*(\sqrt f)$. Since $\mathrm{Ent}_m$ is not trivial, the same is true for $|\nabla^-\mathrm{Ent}_m|$ and
for $Ch_*$. In turn, we will see that lower semicontinuity of $|\nabla^-\mathrm{Ent}_m|^2$ is implied by $CD(K,\infty)$.



6 Identification between relaxed gradient and weak upper gradient

The key statement that will enable us to prove the main identification result of this section 
is provided in the following lemma. It corresponds precisely to [17, Proposition 3.7]: the 
main improvement here is the use of the refined analysis of the Hamilton-Jacobi equation 
semigroup we did in Section 3, together with the use of relaxed gradients, in place of the 
standard Sobolev spaces in Alexandrov spaces. In this way we can also avoid any lower 
curvature bound on {X, d) and we do not even require that {X, d) is a length space. 

Lemma 6.1 (A key estimate for the Wasserstein velocity) Let $(X,\tau,d,m)$ be a Polish
extended measure space satisfying

\[ m(\{x \in X : d(x,K) \le r\}) < \infty \quad \text{for every compact } K \subset X \text{ and } r > 0. \tag{6.1} \]

Let $(f_t)$ be the gradient flow of $Ch_*$ in $L^2(X,m)$ starting from a nonnegative $f_0 \in L^2(X,m)$
and let us assume that

\[ \int_X f_t\,dm = 1, \qquad \int_0^t\!\!\int_{\{f_s>0\}} \frac{|\nabla f_s|_*^2}{f_s}\,dm\,ds < \infty \quad \text{for every } t \ge 0. \tag{6.2} \]

Then, setting $\mu_t := f_t m \in \mathscr P(X)$, the curve $t \mapsto \mu_t$ is locally absolutely continuous
from $(0,\infty)$ to $(\mathscr P_{[\mu_0]}(X), W_2)$ and its metric speed satisfies

\[ |\dot\mu_t|^2 \le \int_{\{f_t>0\}} \frac{|\nabla f_t|_*^2}{f_t}\,dm \quad \text{for a.e. } t \in (0,\infty). \tag{6.3} \]

Proof. We start from the duality formula (2.21): it is easy to check that it can be written as

\[ \frac{W_2^2(\mu,\nu)}{2} = \sup_\varphi \int_X Q_1\varphi\,d\nu - \int_X \varphi\,d\mu \tag{6.4} \]

where the supremum runs in $C_b(X)$. Now, if $\mu \ll m$, we may equivalently consider $\tau$-lower
semicontinuous functions $\varphi$ of the form (3.21); indeed, given $\varphi \in C_b(X)$, considering a
sequence of compact sets $K_n \subset X$ whose union is of full $m$-measure and setting

\[ \varphi_n(x) := \begin{cases} \varphi(x) & \text{if } x \in K_n, \\ \sup \varphi & \text{if } x \in X \setminus K_n, \end{cases} \]

we obtain $\varphi_n \downarrow \varphi$ $m$-a.e. and $Q_1\varphi_n \ge Q_1\varphi$.

Moreover, since (6.4) is invariant by adding constants to $\varphi$, we can always assume that
$M = 0$ in (3.21) and that $\varphi$ vanishes outside a compact set $K$.

Now, if $\varphi$ is of the form (3.21) with $M = 0$, we notice that for all $\varepsilon > 0$ the map $Q_\varepsilon\varphi$ is
$d$-Lipschitz, bounded and lower semicontinuous (the latter property follows by Proposition 3.8),
and

\[ Q_\varepsilon\varphi(x) = 0 \quad \text{if } d(x,K) > 2\sqrt{\max(-\varphi)} \text{ and } \varepsilon < 2, \tag{6.5} \]

so that, by (6.1), $(Q_\varepsilon\varphi)_{\varepsilon \in [0,2]}$ is uniformly bounded in each $L^p(X,m)$.
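
The support property (6.5) can be checked directly. The sketch below assumes that $Q_\varepsilon$ is the Hopf-Lax operator $Q_\varepsilon\varphi(x) = \inf_y \{\varphi(y) + d^2(x,y)/(2\varepsilon)\}$ of Section 3, that the normalization $M = 0$ means $\varphi \le 0$, and that $\max(-\varphi) > 0$ (otherwise $\varphi \equiv 0$ and there is nothing to prove); these readings are ours and are not restated in this part of the text.

```latex
% Take x with d(x,K) > 2\sqrt{\max(-\varphi)} and 0 < \varepsilon < 2; in particular x \notin K.
\begin{align*}
y \in K: &\quad
  \varphi(y) + \frac{d^2(x,y)}{2\varepsilon}
  > -\max(-\varphi) + \frac{4\max(-\varphi)}{2\varepsilon}
  = \Big(\frac{2}{\varepsilon} - 1\Big)\max(-\varphi) > 0, \\
y \notin K: &\quad
  \varphi(y) + \frac{d^2(x,y)}{2\varepsilon}
  = \frac{d^2(x,y)}{2\varepsilon} \ge 0,
  \quad \text{with equality at } y = x .
\end{align*}
% Hence the infimum defining Q_\varepsilon\varphi(x) is attained at y = x and equals 0.
```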

In addition, since $\varphi$ is $\tau$-lower semicontinuous, by Remark 3.7 we have $Q_\varepsilon\varphi \uparrow \varphi$ as $\varepsilon \downarrow 0$.
Hence, by approximating $\varphi$ with $Q_\varepsilon\varphi$ and taking also the convergence of $Q_{1+\varepsilon}\varphi \le Q_1(Q_\varepsilon\varphi)$
to $Q_1\varphi$ as $\varepsilon \downarrow 0$ into account (recall the continuity property (3.5) and that $t_* = \infty$ in this
case), we see that the supremum in (6.4) can be taken over the set of bounded $\varphi$'s with $Q_t\varphi$
$\tau$-lower semicontinuous for all $t \ge 0$ and uniformly $d$-Lipschitz.

Fix such a function $\varphi$ and observe that, thanks to the pointwise estimates (3.12) and
(3.20), the map $t \mapsto Q_t\varphi$ is Lipschitz with values in $L^\infty(X,m)$, and a fortiori in $L^2(X,m)$ by
(6.5). In addition, the "functional" derivative (i.e. the strong limit in $L^2$ of the difference
quotients) $\partial_t Q_t\varphi$ of this $L^2(X,m)$-valued map is easily seen to coincide, for a.e. $t$, with the
map $\frac{d}{dt} Q_t\varphi(x)$. Recall also that, still thanks to Proposition 3.8, the latter map is Borel and
$|\nabla Q_t\varphi|$ is $\mathscr B^*(X \times (0,\infty))$-measurable.

Fix also $0 < t < s$, set $\ell = (s - t)$ and recall that since $(f_t)$ is the gradient flow of $Ch_*$ in
$L^2(X,m)$, the map $[0,\ell] \ni r \mapsto f_{t+r}$ is Lipschitz with values in $L^2(X,m)$.

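Now, for $a, b : [0,\ell] \to L^2(X,m)$ Lipschitz, it is well known that $t \mapsto \int_X a_t b_t\,dm$ is Lipschitz
in $[0,\ell]$ and that $\big(\int_X a_t b_t\,dm\big)' = \int_X b_t\,\partial_t a_t\,dm + \int_X a_t\,\partial_t b_t\,dm$ for a.e. $t \in [0,\ell]$. Therefore we
get

\[ \frac{d}{dr} \int_X Q_{r/\ell}\varphi\,f_{t+r}\,dm = \int_X \frac{1}{\ell}\,\xi_{r/\ell}\,f_{t+r} + Q_{r/\ell}\varphi\,\Delta_{d,m} f_{t+r}\,dm \quad \text{for a.e. } r > 0, \]

where $\xi_s(x) := \frac{d}{dt} Q_t\varphi\big|_{t=s}(x)$; we have then:

\begin{align*}
\int_X Q_1\varphi\,d\mu_s - \int_X \varphi\,d\mu_t
 &= \int_X Q_1\varphi\,f_{t+\ell}\,dm - \int_X \varphi\,f_t\,dm \\
 &= \int_0^\ell\!\!\int_X \frac{1}{\ell}\,\xi_{r/\ell}\,f_{t+r} + Q_{r/\ell}\varphi\,\Delta_{d,m} f_{t+r}\,dm\,dr \\
 &= \int_X\!\int_0^\ell \frac{1}{\ell}\,\xi_{r/\ell}\,f_{t+r} + Q_{r/\ell}\varphi\,\Delta_{d,m} f_{t+r}\,dr\,dm \\
 &\le \int_X\!\int_0^\ell -\frac{|\nabla Q_{r/\ell}\varphi|^2}{2\ell}\,f_{t+r} + Q_{r/\ell}\varphi\,\Delta_{d,m} f_{t+r}\,dr\,dm. \tag{6.6}
\end{align*}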

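In the last two steps we used first Fubini's theorem and then Theorem 3.5. Observe that by
inequalities (4.25) and (4.8) we have (using also that $|\nabla f_s|_* = 0$ $m$-a.e. on $\{f_s = 0\}$)

\begin{align*}
\int_X Q_{r/\ell}\varphi\,\Delta_{d,m} f_{t+r}\,dm
 &\le \int_X |\nabla Q_{r/\ell}\varphi|_*\,|\nabla f_{t+r}|_*\,dm \\
 &\le \frac{1}{2\ell} \int_X |\nabla Q_{r/\ell}\varphi|^2\,f_{t+r}\,dm + \frac{\ell}{2} \int_{\{f_{t+r}>0\}} \frac{|\nabla f_{t+r}|_*^2}{f_{t+r}}\,dm.
\end{align*}

Plugging this inequality in (6.6) and using once more Fubini's theorem we obtain

\[ \int_X Q_1\varphi\,d\mu_s - \int_X \varphi\,d\mu_t \le \frac{\ell}{2} \int_0^\ell\!\!\int_{\{f_{t+r}>0\}} \frac{|\nabla f_{t+r}|_*^2}{f_{t+r}}\,dr\,dm. \]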
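
The second inequality in the estimate of the Laplacian term is Young's inequality applied pointwise on the set where the density is positive, followed by the bound of the relaxed gradient by the slope (which we assume is the content of (4.8)). In the sketch below, $u = Q_{r/\ell}\varphi$ and $f = f_{t+r}$ are shorthands introduced here for brevity.

```latex
% Pointwise on \{f > 0\}, with u = Q_{r/\ell}\varphi and f = f_{t+r}:
\begin{align*}
|\nabla u|_*\,|\nabla f|_*
 &= \Big( \frac{|\nabla u|_*\sqrt f}{\sqrt\ell} \Big)
    \Big( \frac{\sqrt\ell\,|\nabla f|_*}{\sqrt f} \Big)
  \le \frac{1}{2\ell}\,|\nabla u|_*^2\,f + \frac{\ell}{2}\,\frac{|\nabla f|_*^2}{f} \\
 &\le \frac{1}{2\ell}\,|\nabla u|^2\,f + \frac{\ell}{2}\,\frac{|\nabla f|_*^2}{f}.
\end{align*}
% After integration, the first term cancels the negative term -|\nabla u|^2 f/(2\ell)
% produced by the Hamilton-Jacobi inequality, leaving only the Fisher-information term.
```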

This latter bound does not depend on $\varphi$, so from (6.4) we deduce

\[ W_2^2(\mu_t,\mu_s) \le \ell \int_0^\ell\!\!\int_{\{f_{t+r}>0\}} \frac{|\nabla f_{t+r}|_*^2}{f_{t+r}}\,dr\,dm, \qquad \ell = s - t. \]

By (6.2) we immediately get that € {X) and (6.3) holds. □ 

In the next two results we will consider the class of (measurable) functions

\[ f : X \to \mathbb R \text{ such that } f_N := \min\{N, \max\{f, -N\}\} \in L^2(X,m) \text{ for every } N > 0. \tag{6.7} \]

Theorem 6.2 (Relaxed and weak upper gradients coincide) Let $(X,\tau,d,m)$ be a Polish
extended measure space with $m$ satisfying (4.2) and let $\mathcal T$ be the collection of all test plans
concentrated on $AC^2((0,1);(X,d))$ with bounded compression on the sublevels of $V$ (5.4).
A measurable function $f : X \to \mathbb R$ satisfying (6.7) has relaxed gradient $|\nabla f|_*$ (according to
(4.14)) in $L^2(X,m)$ iff $f$ is Sobolev on $\mathcal T$-almost all curves and $|\nabla f|_{w,\mathcal T} \in L^2(X,m)$. In this
case

\[ |\nabla f|_* = |\nabla f|_{w,\mathcal T} \quad m\text{-a.e. in } X. \tag{6.8} \]

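Proof. Taking into account Remark 5.16 and Lemma 4.11 it is not restrictive to assume that
$m \in \mathscr P(X)$, so that we can choose $V \equiv 1$. Moreover, we can assume that $0 < M^{-1} \le f \le
M < \infty$ $m$-almost everywhere in $X$ with $\int_X f^2\,dm = 1$. By Corollary 5.13 we have to prove
that if $f$ is Sobolev on $\mathcal T$-almost all curves with $|\nabla f|_{w,\mathcal T} \in L^2(X,m)$ then

\[ Ch_*(f) \le \frac12 \int_X |\nabla f|_{w,\mathcal T}^2\,dm. \tag{6.9} \]

We consider the gradient flow $(h_t)$ of Cheeger's energy with initial datum $h := f^2$, setting
$\mu_t = h_t m$, and we apply Lemma 6.1. If $g = h^{-1}|\nabla h|_{w,\mathcal T}$, we easily get arguing as in (5.16)
and using inequality (6.3)

\begin{align*}
\int_X (h\log h - h_t\log h_t)\,dm
 &\le \Big( \int_0^t\!\!\int_X g^2 h_s\,dm\,ds \Big)^{1/2} \Big( \int_0^t |\dot\mu_s|^2\,ds \Big)^{1/2} \\
 &\le \frac12 \int_0^t\!\!\int_X g^2 h_s\,dm\,ds + \frac12 \int_0^t |\dot\mu_s|^2\,ds \\
 &\le \frac12 \int_0^t\!\!\int_X g^2 h_s\,dm\,ds + \frac12 \int_0^t\!\!\int_{\{h_s>0\}} \frac{|\nabla h_s|_*^2}{h_s}\,dm\,ds.
\end{align*}

Recalling the entropy dissipation formula (4.54) we obtain

\[ \int_0^t\!\!\int_{\{h_s>0\}} \frac{|\nabla h_s|_*^2}{h_s}\,dm\,ds \le \int_0^t\!\!\int_X g^2 h_s\,dm\,ds. \]

Now, (4.16) and the identity $g = 2f^{-1}|\nabla f|_{w,\mathcal T}$ give $\int_0^t Ch_*(\sqrt{h_s})\,ds \le \frac12 \int_0^t\!\int_X |\nabla f|_{w,\mathcal T}^2\,f^{-2} h_s\,dm\,ds$,
so that dividing by $t$ and passing to the limit as $t \downarrow 0$ we get (6.9), since $\sqrt{h_s}$ are equibounded
and converge strongly to $f$ in $L^2(X,m)$ as $s \downarrow 0$. □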
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Corollary 6.3 Let 7i be the collection of all test plans concentrated on AC^((0, 1); (X, d)) 
with bounded com^pression on the sublevels of V as in (5.4), and let T2 be the collection 
of all test plans concentrated on AC^((0, 1); (X, d)) satisfying (5.10). Let us suppose that 
a measurable function / : X — >■ R satisfying (6.7) is Sobolev on 7i-almost all curves with 
\^f\w,7i € L'^{X,m). Then f is Sobolev on 72-almost all curves and 

|V/U,T, = |V/U,T. = |V/U m-a.e. in X. (6.10) 
Proof. Applying Theorem 6.2 and Corollary 5.13 we prove that / is Sobolev on T2-alniost all 
curves with |V/|^,t^ > |V/|„,,t2- "^^^ converse inequality follows by Remark 5.11. □ 

Remark 6.4 (One-sided relaxed gradients) Theorem 6.2 shows that the one-sided Cheeger's functionals $\mathsf{Ch}^\pm_*(f)$ (and the corresponding relaxed gradients $|\nabla^\pm f|_*$) introduced in Remark 4.23 coincide with $\mathsf{Ch}_*(f)$ (resp. with $|\nabla f|_*$). In fact, we already observed that $\mathsf{Ch}_*(f) \ge \mathsf{Ch}^\pm_*(f)$. On the other hand, since $|\nabla^\pm f|$ are weak upper gradients for Borel $\mathsf d$-Lipschitz functions, arguing as in Corollary 5.13 we get $|\nabla f|_{w,\mathcal T} \le |\nabla^\pm f|_*$, if $f \in D(\mathsf{Ch}^\pm_*)$ and $\mathcal T$ is the class of all test plans concentrated on $AC^2((0,1);(X,\mathsf d))$ with bounded compression on the sublevels of $V$. Corollary 6.3 yields

\[ \mathsf{Ch}_*(f) = \mathsf{Ch}^\pm_*(f), \qquad |\nabla f|_* = |\nabla^\pm f|_* \quad \mathfrak m\text{-a.e. in } X. \]



7 Relative entropy, Wasserstein slope, and Fisher information 

In this section we assume that $(X,\tau,\mathsf d)$ is a Polish extended space equipped with a $\sigma$-finite Borel reference measure $\mathfrak m$ such that $\tilde{\mathfrak m} := e^{-V^2}\mathfrak m$ has total mass less than 1 for some Borel and $\mathsf d$-Lipschitz $V : X \to [0,\infty)$. We shall work in the subspace

\[ \mathscr P_V(X) := \Big\{ \mu \in \mathscr P(X) : \int_X V^2\,d\mu < \infty \Big\}. \tag{7.1} \]

We say that $(\mu_n) \subset \mathscr P_V(X)$ weakly converges with moments to $\mu \in \mathscr P_V(X)$ if $\mu_n \to \mu$ weakly in $\mathscr P(X)$ and $\int_X V^2\,d\mu_n \to \int_X V^2\,d\mu$. Analogously we define strong convergence with moments in $\mathscr P(X)$, by requiring that $|\mu_n - \mu|(X) \to 0$, instead of the weak convergence. Since for every $\mu \in \mathscr P_V(X)$, $\nu \in \mathscr P(X)$ with $W_2(\mu,\nu) < \infty$ we have $\nu \in \mathscr P_V(X)$ and

\[ \Big(\int_X V^2\,d\nu\Big)^{1/2} \le \mathrm{Lip}(V)\,W_2(\mu,\nu) + \Big(\int_X V^2\,d\mu\Big)^{1/2}, \tag{7.2} \]

we obtain that weak convergence with moments is implied by $W_2$ convergence. When the topology $\tau$ is induced by the distance $\mathsf d$ and $V(x) := A\,\mathsf d(x,x_0)$ for some $A > 0$ and $x_0 \in X$, then weak convergence with moments is in fact equivalent to $W_2$ convergence. When $\mathfrak m(X) < \infty$ we may take $V$ equal to a constant, so that $\mathscr P(X) = \mathscr P_V(X)$ and weak convergence with moments reduces to weak convergence.

7.1 Relative entropy 

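Definition 7.1 (Relative entropy) The relative entropy functional $\mathrm{Ent}_{\mathfrak m} : \mathscr P(X) \to (-\infty,+\infty]$ is defined as

\[
\mathrm{Ent}_{\mathfrak m}(\mu) :=
\begin{cases}
\displaystyle\int_X \rho\log\rho\,d\mathfrak m & \text{if } \mu = \rho\mathfrak m \in \mathscr P_V(X), \\[1ex]
+\infty & \text{otherwise.}
\end{cases}
\]

Notice that, according to our definition, $\mu \in D(\mathrm{Ent}_{\mathfrak m})$ implies $\int_X V^2\,d\mu < \infty$, and that $D(\mathrm{Ent}_{\mathfrak m})$ is convex. Strictly speaking the notation $\mathrm{Ent}_{\mathfrak m}$ is a slight abuse, since the functional depends also on the choice of $V$, which is not canonically induced by $\mathfrak m$ (not even in Euclidean spaces endowed with the Lebesgue measure). It is tacitly understood that we take $V$ equal to a constant whenever $\mathfrak m(X) < \infty$ and in this case $\mathrm{Ent}_{\mathfrak m}$ is independent of the chosen constant.

When $\mathfrak m \in \mathscr P(X)$ the functional $\mathrm{Ent}_{\mathfrak m}$ is sequentially lower semicontinuous w.r.t. weak convergence in $\mathscr P(X)$. In addition, it is nonnegative, thanks to Jensen's inequality. More generally, if $\mathfrak m$ is a finite measure and $\tilde{\mathfrak m} := \mathfrak m(X)^{-1}\mathfrak m$,

\[ \mathrm{Ent}_{\mathfrak m}(\mu) = \mathrm{Ent}_{\tilde{\mathfrak m}}(\mu) - \log(\mathfrak m(X)) \ge -\log(\mathfrak m(X)) \quad \text{for every } \mu \in \mathscr P(X), \tag{7.3} \]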

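and we have the general inequality (see for instance [4, Lemma 9.4.5])

\[ \mathrm{Ent}_{\pi_\sharp\mathfrak m}(\pi_\sharp\mu) \le \mathrm{Ent}_{\mathfrak m}(\mu) \quad \text{for every } \mu \in \mathscr P(X) \text{ and } \pi : X \to Y \text{ Borel map}, \tag{7.4} \]

which turns out to be an equality if $\pi$ is injective. When $\mathfrak m(X) = \infty$, since the density $\tilde\rho$ of $\mu$ w.r.t. $\tilde{\mathfrak m}$ equals $\rho e^{V^2}$ we obtain that the negative part of $\rho\log\rho$ is $L^1(\mathfrak m)$-integrable for $\mu \in \mathscr P_V(X)$, so that Definition 7.1 is well posed and $\mathrm{Ent}_{\mathfrak m}$ does not attain the value $-\infty$. We also obtain the useful formula

\[ \mathrm{Ent}_{\mathfrak m}(\mu) = \mathrm{Ent}_{\tilde{\mathfrak m}}(\mu) - \int_X V^2\,d\mu \qquad \forall \mu \in \mathscr P_V(X). \tag{7.5} \]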

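The same formula shows $\mathrm{Ent}_{\mathfrak m}$ is sequentially lower semicontinuous in $\mathscr P_V(X)$ w.r.t. convergence with moments, i.e.

\[ \mu_n \rightharpoonup \mu \text{ in } \mathscr P(X), \quad \int_X V^2\,d\mu_n \to \int_X V^2\,d\mu < \infty \quad\Longrightarrow\quad \liminf_{n\to\infty} \mathrm{Ent}_{\mathfrak m}(\mu_n) \ge \mathrm{Ent}_{\mathfrak m}(\mu). \tag{7.6} \]

From (7.2) we also get

\[ \mu \in \mathscr P_V(X), \quad W_2(\mu_n,\mu) \to 0 \quad\Longrightarrow\quad \liminf_{n\to\infty} \mathrm{Ent}_{\mathfrak m}(\mu_n) \ge \mathrm{Ent}_{\mathfrak m}(\mu). \tag{7.7} \]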
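On a finite set the identities (7.3) and (7.5) and the inequality (7.4) reduce to elementary computations, which gives a quick way to sanity-check the signs. The following Python sketch is our own illustration and not part of the original argument; the helper names `relative_entropy` and `push_forward` and the data `m`, `mu`, `V`, `pi` are hypothetical choices.

```python
import math

def relative_entropy(mu, m):
    """Sum of mu(x) * log(mu(x) / m(x)) over a finite set, with 0 log 0 = 0."""
    total = 0.0
    for mu_x, m_x in zip(mu, m):
        if mu_x == 0.0:
            continue
        if m_x == 0.0:
            return math.inf  # mu is not absolutely continuous w.r.t. m
        total += mu_x * math.log(mu_x / m_x)
    return total

def push_forward(weights, pi, n_targets):
    """Image measure under the map x -> pi[x] between finite sets."""
    image = [0.0] * n_targets
    for x, w in enumerate(weights):
        image[pi[x]] += w
    return image

# Hypothetical data: a finite reference measure m, a probability measure mu,
# and a weight V playing the role of the Lipschitz function in (7.1).
m = [0.1, 0.4, 0.3, 0.6]
mu = [0.1, 0.4, 0.3, 0.2]
V = [0.0, 1.0, 0.5, 2.0]

# (7.3): normalising m shifts the entropy by log(m(X)).
mass = sum(m)
m_normalised = [w / mass for w in m]
gap_73 = relative_entropy(mu, m) - (relative_entropy(mu, m_normalised) - math.log(mass))

# (7.5): passing to the weighted measure e^{-V^2} m costs the second moment of V.
m_tilde = [math.exp(-v * v) * w for v, w in zip(V, m)]
second_moment = sum(v * v * p for v, p in zip(V, mu))
gap_75 = relative_entropy(mu, m) - (relative_entropy(mu, m_tilde) - second_moment)

# (7.4): a non-injective map (merging points 0,1 and 2,3) cannot increase the entropy.
pi = [0, 0, 1, 1]
entropy_before = relative_entropy(mu, m)
entropy_after = relative_entropy(push_forward(mu, pi, 2), push_forward(m, pi, 2))
```

Both `gap_73` and `gap_75` vanish up to rounding, while `entropy_after` does not exceed `entropy_before`, in agreement with (7.4).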

The following lemma for the change of reference measure in the entropy, related to (7.5), will 
be useful. 

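Lemma 7.2 (Change of reference measure in the entropy) Let $\nu \in \mathscr P(X)$ and the positive finite measure $\mathfrak n$ be satisfying $\mathrm{Ent}_{\mathfrak n}(\nu) < \infty$. If $\nu = g\mathfrak m$ for some $\sigma$-finite Borel measure $\mathfrak m$, then $g\log g \in L^1(X,\mathfrak m)$ if and only if $\log(d\mathfrak n/d\mathfrak m) \in L^1(X,\nu)$ and

\[ \mathrm{Ent}_{\mathfrak m}(\nu) = \mathrm{Ent}_{\mathfrak n}(\nu) + \int_X \log\Big(\frac{d\mathfrak n}{d\mathfrak m}\Big)\,d\nu. \tag{7.8} \]

Proof. Write $\nu = f\mathfrak n$ and let $\mathfrak n = h\mathfrak m + \mathfrak n^s$ be the Radon-Nikodym decomposition of $\mathfrak n$ w.r.t. $\mathfrak m$. Since $g\mathfrak m = \nu = fh\mathfrak m + f\mathfrak n^s$ we obtain that $g = fh$ $\mathfrak m$-a.e. in $X$ and $f = 0$ $\mathfrak n^s$-a.e. in $X$. Since $f\log f \in L^1(X,\mathfrak n)$ we obtain that $\chi_{\{h>0\}}\, g\log(g/h)$ belongs to $L^1(X,\mathfrak m)$, so that (taking into account that $\{g > 0\} \subset \{h > 0\}$ up to $\mathfrak m$-negligible sets) $g\log g \in L^1(X,\mathfrak m)$ if and only if $g\log h \in L^1(X,\mathfrak m)$. The latter property is equivalent to $\log h \in L^1(X,\nu)$. □

Remark 7.3 (Tightness of sublevels of $\mathrm{Ent}_{\mathfrak m}(\mu)$ and setwise convergence) We remark that the sublevels of the relative entropy functional are tight if $\mathfrak m(X) < \infty$. Indeed, by Ulam's theorem $\mathfrak m$ is tight. Then, using first the inequality $z\log(z) \ge -e^{-1}$ and then Jensen's inequality, for $\mu = \rho\mathfrak m$ we get

\[ \frac{\mathfrak m(X)}{e} + C \ \ge\ \frac{\mathfrak m(X\setminus E)}{e} + \mathrm{Ent}_{\mathfrak m}(\mu) \ \ge\ \int_E \rho\log\rho\,d\mathfrak m \ \ge\ \mu(E)\log\Big(\frac{\mu(E)}{\mathfrak m(E)}\Big) \tag{7.9} \]

whenever $E \in \mathscr B(X)$ and $\mathrm{Ent}_{\mathfrak m}(\mu) \le C$. This shows that $\mu(E) \to 0$ as $\mathfrak m(E) \to 0$ uniformly in the set $\{\mathrm{Ent}_{\mathfrak m} \le C\}$.

In general, when $\int_X e^{-V^2}\,d\mathfrak m \le 1$, we see that (7.5) yields

\[ \Big\{ \mu \in \mathscr P(X) : \int_X V^2\,d\mu + \mathrm{Ent}_{\mathfrak m}(\mu) \le C \Big\} \ \text{ is tight in } \mathscr P(X) \text{ for every } C \in \mathbb R. \tag{7.10} \]

Moreover, if a sequence $(\mu_n)$ belongs to a sublevel (7.10) and weakly converges to $\mu$, then the sequence of the corresponding densities $\rho_n = \frac{d\mu_n}{d\mathfrak m}$ converges to $\rho = \frac{d\mu}{d\mathfrak m}$ weakly in $L^1(X,\mathfrak m)$: since $L^\infty(X,\mathfrak m) = L^\infty(X,\tilde{\mathfrak m})$, it is sufficient to recall (7.3) and to apply de la Vallée Poussin's criterion for uniform integrability [7, §4.5.10] to the densities of $\mu_n$ w.r.t. the finite measure $\tilde{\mathfrak m} = e^{-V^2}\mathfrak m$. In particular the sequence $(\mu_n)$ setwise converges to $\mu$, i.e. $\mu_n(B) \to \mu(B)$ for every $B \in \mathscr B(X)$. ■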
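The change of reference measure formula (7.8) of Lemma 7.2 can also be tested on a finite set, including the case where $\mathfrak n$ has a part that is singular w.r.t. $\mathfrak m$. The Python sketch below is our own illustration under hypothetical data (`m`, `n`, `f`); it mirrors the decomposition $\mathfrak n = h\mathfrak m + \mathfrak n^s$ used in the proof.

```python
import math

def relative_entropy(nu, ref):
    """Sum of nu(x) * log(nu(x) / ref(x)) over a finite set, with 0 log 0 = 0."""
    total = 0.0
    for nu_x, ref_x in zip(nu, ref):
        if nu_x == 0.0:
            continue
        if ref_x == 0.0:
            return math.inf  # nu is not absolutely continuous w.r.t. ref
        total += nu_x * math.log(nu_x / ref_x)
    return total

# Hypothetical data on a four-point set.
# m gives no mass to the last point, so n has a singular part n^s there.
m = [1.0, 2.0, 0.5, 0.0]
n = [0.5, 1.0, 2.0, 0.7]

# nu = f n, with f = 0 on the support of the singular part (as in the proof).
f = [0.8, 0.3, 0.15, 0.0]
nu = [f_x * n_x for f_x, n_x in zip(f, n)]

# Integral of log(dn/dm) = log h against nu; only points charged by nu matter.
log_h_integral = sum(
    nu_x * math.log(n_x / m_x)
    for nu_x, n_x, m_x in zip(nu, n, m)
    if nu_x > 0.0
)

lhs = relative_entropy(nu, m)                   # Ent_m(nu)
rhs = relative_entropy(nu, n) + log_h_integral  # Ent_n(nu) + int log(dn/dm) dnu
```

The two sides `lhs` and `rhs` agree up to rounding; the singular part of `n` plays no role because `nu` does not charge it.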



7.2 Entropy dissipation, slope and Fisher information 

In this subsection we collect some general properties of the relative entropy, its Wasserstein 
slope and the Fisher information functional defined via the relaxed gradient that we intro- 
duced in the previous section. 

We will always assume that m satisfies condition (4.2), so that Theorem 4.20 will be 
applicable. 

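Theorem 7.4 Let $\mu = \rho\mathfrak m \in D(\mathrm{Ent}_{\mathfrak m})$ with $|\nabla^-\mathrm{Ent}_{\mathfrak m}|(\mu) < \infty$. Then $\sqrt\rho \in D(\mathsf{Ch}_*)$ and

\[ 4\int_X |\nabla\sqrt\rho|_*^2\,d\mathfrak m \le |\nabla^-\mathrm{Ent}_{\mathfrak m}|^2(\mu). \tag{7.11} \]

Proof. Let us first assume that $\rho \in L^2(X,\mathfrak m)$ and let $(\rho_t)$ be the gradient flow of the Cheeger's functional starting from $\rho$; we set $\mu_t := \rho_t\mathfrak m$ and recall Definition 4.9 of the Fisher information functional $\mathsf F$. Applying Proposition 4.22 and Lemma 6.1 we get

\[ \mathrm{Ent}_{\mathfrak m}(\mu) - \mathrm{Ent}_{\mathfrak m}(\mu_t) \ge \frac12\int_0^t \mathsf F(\rho_s)\,ds + \frac12\int_0^t |\dot\mu_s|^2\,ds. \tag{7.12} \]

By the Cauchy-Schwarz inequality, the bound $W_2(\mu,\mu_t) \le \int_0^t |\dot\mu_s|\,ds$ and Young's inequality, the right-hand side of (7.12) is bounded from below by $\frac1t\big(\int_0^t \sqrt{\mathsf F(\rho_s)}\,ds\big)\,W_2(\mu,\mu_t)$. Dividing by $W_2(\mu,\mu_t)$ and passing to the limit as $t \downarrow 0$ we get (7.11), since the lower semicontinuity of Cheeger's functional yields

\[ \sqrt{\mathsf F(\rho)} \le \liminf_{t\downarrow 0} \frac1t\int_0^t \sqrt{\mathsf F(\rho_s)}\,ds. \]

In the general case when only the integrability conditions $\int_X \rho\log\rho\,d\mathfrak m < \infty$ and $\int_X V^2\rho\,d\mathfrak m < \infty$ are available, we can still prove (7.12) by approximation. We set $\rho^n := z_n\min\{\rho, n\}$, with $z_n \uparrow 1$ normalizing constants, and denote by $\rho^n_t$ the gradient flows of Cheeger's energy starting from $\rho^n$. Since Proposition 4.16(b) provides the monotonicity property $\rho^n_t \le \rho^m_t$ $\mathfrak m$-a.e. in $X$ for $n \le m$, we can define $\rho_t := \sup_n \rho^n_t$. Since $\int_X \rho^n_t\,d\mathfrak m = z_n$ it is immediate to check that $\rho_t\mathfrak m \in \mathscr P(X)$ and a simple monotonicity argument based on the apriori estimate (4.46) guaranteed by Theorem 4.20 also gives that $\mu_t := \rho_t\mathfrak m \in \mathscr P_V(X)$ and that $z_n^{-1}\rho^n_t\mathfrak m$ converge with moments to $\mu_t$. It is then easy to pass to the limit in (7.12), using the sequential lower semicontinuity of entropy with respect to convergence with moments, to get

\[ \mathrm{Ent}_{\mathfrak m}(\mu) - \mathrm{Ent}_{\mathfrak m}(\mu_t) \ge \frac1t\Big(\int_0^t \sqrt{\mathsf F(\rho_s)}\,ds\Big)\,W_2(\mu,\mu_t) \]

and then conclude as before. □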

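Theorem 7.5 Let $\mu = \rho\mathfrak m \in D(\mathrm{Ent}_{\mathfrak m})$. Assume that $\rho = \max\{\rho_0, c\,e^{-2V^2}\}$, where $c > 0$ and $\rho_0$ is a $\mathsf d$-Lipschitz and bounded map identically 0 for $V$ sufficiently large. Then

\[ |\nabla^-\mathrm{Ent}_{\mathfrak m}|^2(\mu) \le \int_X \frac{|\nabla^-\rho|^2}{\rho}\,d\mathfrak m = 4\int_X |\nabla^-\sqrt\rho|^2\,d\mathfrak m. \tag{7.13} \]

Proof. We set $L = \mathrm{Lip}(V)$, $M := \sup\rho_0$ and choose $C \ge 0$ in such a way that $\rho_0 = 0$ on $\{2V^2 > C\}$. Possibly multiplying $\rho$ and $\mathfrak m$ by constants we assume $\mathrm{Lip}(\rho_0) = 1$. Let us introduce the non negative $\mathscr B^*(X\times X)$-measurable function

\[
L(x,y) :=
\begin{cases}
\dfrac{\big(\log\rho(x) - \log\rho(y)\big)^+}{\mathsf d(x,y)} & \text{if } x \ne y, \\[2ex]
\dfrac{|\nabla^-\rho(x)|}{\rho(x)} & \text{if } x = y,
\end{cases}
\tag{7.14}
\]

and notice that for every $x \in X$, the map $y \mapsto L(x,y)$ is $\mathsf d$-upper semicontinuous.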
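We claim that for some constants $C'$, $C''$ depending only on $M$, $c$ and $C$ it holds

\[ L(x,y) \le C' + C''\big(\mathsf d(x,y) + V(x)\big), \qquad \forall x, y \in X. \tag{7.15} \]

To prove this, let $A := \{\rho_0 > c\,e^{-2V^2}\}$ and notice that $\log\rho \ge \log c - 2V^2$ gives

\[
\big(\log\rho(x) - \log\rho(y)\big)^+ \le
\begin{cases}
|\log(\rho_0(x)) - \log(\rho_0(y))| & \text{if } x, y \in A, \\
2\,|V^2(x) - V^2(y)| & \text{if } x \notin A, \\
\big(\log(\rho_0(x)) + 2V^2(y) - \log c\big)^+ & \text{if } x \in A,\ y \notin A.
\end{cases}
\tag{7.16}
\]

Since $2V^2 \le C$ on $A$, the function $\rho_0$ is $\mathsf d$-Lipschitz and bounded from below by $c\,e^{-C}$ on $A$, so that

\[ |\log(\rho_0(x)) - \log(\rho_0(y))| \le \frac{e^C}{c}\,|\rho_0(x) - \rho_0(y)| \le \frac{e^C}{c}\,\mathsf d(x,y) \qquad \forall x, y \in A. \tag{7.17} \]

Also, for all $x, y \in X$ it holds

\[ |V^2(x) - V^2(y)| = |V(x) - V(y)|\,|V(x) + V(y)| \le L\,\mathsf d(x,y)\big(L\,\mathsf d(x,y) + 2V(x)\big). \tag{7.18} \]

Finally, let us consider the case $x \in A$, $y \notin A$; since $2V^2(y) - \log c \le -\log\rho_0(y)$ and $\rho_0(y) \ge c\,e^{-C}/2$ if $\mathsf d(x,y) \le a := c\,e^{-C}/2$, we get

\[ \log(\rho_0(x)) + 2V^2(y) - \log c \le \log(\rho_0(x)) - \log(\rho_0(y)) \le \frac{2e^C}{c}\,\mathsf d(x,y) \tag{7.19} \]

for $\mathsf d(x,y) \le a$. If, instead, $\mathsf d(x,y) > a$ we use the fact that $\rho$ is bounded from above, the bound $2V^2(x) \le C$ for $x \in A$, and (7.18) to get

\[
\begin{aligned}
\log(\rho_0(x)) + 2V^2(y) - \log c &= \log(\rho_0(x)) + 2V^2(x) - \log c + 2\big(V^2(y) - V^2(x)\big) \\
&\le \frac{\mathsf d(x,y)}{a}\big(\log(M/c) + C\big) + 2L\,\mathsf d(x,y)\big(L\,\mathsf d(x,y) + 2V(x)\big).
\end{aligned}
\tag{7.20}
\]

Inequalities (7.16), (7.17), (7.18), (7.19), (7.20) give the claim (7.15).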

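Let us now consider a sequence $(\mu_n = \rho_n\mathfrak m)$, converging to $\mu$ in $\mathscr P_2(X)$ and such that

\[ |\nabla^-\mathrm{Ent}_{\mathfrak m}|(\mu) = \lim_{n\to\infty} \frac{\mathrm{Ent}_{\mathfrak m}(\mu) - \mathrm{Ent}_{\mathfrak m}(\mu_n)}{W_2(\mu,\mu_n)}. \]

From the convexity of the function $r \mapsto r\log r$ we have

\[
\begin{aligned}
\mathrm{Ent}_{\mathfrak m}(\mu) - \mathrm{Ent}_{\mathfrak m}(\mu_n)
&= \int_X (\rho\log\rho - \rho_n\log\rho_n)\,d\mathfrak m \le \int_X \log\rho\,(\rho - \rho_n)\,d\mathfrak m \\
&= \int_X \log\rho\,d\mu - \int_X \log\rho\,d\mu_n = \int_{X\times X} \big(\log\rho(x) - \log\rho(y)\big)\,d\gamma_n \\
&\le \int_{X\times X} L(x,y)\,\mathsf d(x,y)\,d\gamma_n \le W_2(\mu,\mu_n)\Big(\int_{X\times X} L^2(x,y)\,d\gamma_n\Big)^{1/2} \\
&= W_2(\mu,\mu_n)\Big(\int_X\Big(\int_X L^2(x,y)\,d\gamma_{n,x}(y)\Big)\,d\mu(x)\Big)^{1/2},
\end{aligned}
\]

where $\gamma_n$ is any optimal plan between $\mu$ and $\mu_n$ and $\gamma_{n,x}$ is its disintegration w.r.t. its first marginal $\mu$. Since $\int_X\big(\int_X \mathsf d^2(x,y)\,d\gamma_{n,x}(y)\big)\,d\mu(x) \to 0$ as $n \to \infty$ we can assume with no loss of generality that

\[ \lim_{n\to\infty} \int_X \mathsf d^2(x,y)\,d\gamma_{n,x}(y) = 0 \quad \text{for } \mu\text{-a.e. } x \in X, \]

thus taking into account (7.15) we get

\[ \lim_{n\to\infty} \int_{X\setminus B_r(x)} L^2(x,y)\,d\gamma_{n,x}(y) = 0 \quad \text{for } \mu\text{-a.e. } x \in X, \]

for all $r > 0$. Taking an arbitrary radius $r > 0$ we get

\[
\begin{aligned}
\limsup_{n\to\infty} \int_X L^2(x,y)\,d\gamma_{n,x}
&\le \limsup_{n\to\infty} \int_{B_r(x)} L^2(x,y)\,d\gamma_{n,x} + \limsup_{n\to\infty} \int_{X\setminus B_r(x)} L^2(x,y)\,d\gamma_{n,x} \\
&\le \limsup_{n\to\infty} \int_{B_r(x)} L^2(x,y)\,d\gamma_{n,x} \le \sup_{y\in B_r(x)} L^2(x,y).
\end{aligned}
\]

Since $L(x,\cdot)$ is $\mathsf d$-upper semicontinuous, taking the limit as $r \downarrow 0$ in the previous estimate we get $\limsup_n \int_X L^2(x,y)\,d\gamma_{n,x} \le L^2(x,x)$ for $\mu$-a.e. $x \in X$. Using again (7.15), which provides a domination from above with a strongly convergent sequence, we are entitled to use Fatou's lemma to obtain

\[
|\nabla^-\mathrm{Ent}_{\mathfrak m}|(\mu) = \lim_{n\to\infty} \frac{\mathrm{Ent}_{\mathfrak m}(\mu) - \mathrm{Ent}_{\mathfrak m}(\mu_n)}{W_2(\mu,\mu_n)}
\le \limsup_{n\to\infty} \Big(\int_X\Big(\int_X L^2(x,y)\,d\gamma_{n,x}(y)\Big)\,d\mu(x)\Big)^{1/2}
\le \Big(\int_X L^2(x,x)\,d\mu(x)\Big)^{1/2}.
\]

□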



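Theorem 7.6 Let $(X,\tau,\mathsf d,\mathfrak m)$ be a Polish extended space with $\mathfrak m$ satisfying (4.2). Then $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$ is sequentially lower semicontinuous w.r.t. strong convergence with moments in $\mathscr P(X)$ on sublevels of $\mathrm{Ent}_{\mathfrak m}$ if and only if

\[ |\nabla^-\mathrm{Ent}_{\mathfrak m}|^2(\mu) = 4\int_X |\nabla\sqrt\rho|_*^2\,d\mathfrak m \qquad \forall \mu = \rho\mathfrak m \in D(\mathrm{Ent}_{\mathfrak m}). \tag{7.21} \]

In this case $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$ satisfies the following stronger lower semicontinuity property:

\[ \mu_n(B) \to \mu(B) \ \text{ for every } B \in \mathscr B(X) \quad\Longrightarrow\quad \liminf_{n\to\infty} |\nabla^-\mathrm{Ent}_{\mathfrak m}|(\mu_n) \ge |\nabla^-\mathrm{Ent}_{\mathfrak m}|(\mu). \tag{7.22} \]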

Proof. If (7.21) holds then $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$ coincides on its domain with a convex functional (by Lemma 4.10) which is lower semicontinuous with respect to strong convergence in $L^1(X,\mathfrak m)$: therefore it is also weakly lower semicontinuous and (7.22) holds [7, §4.7(v)]. In particular $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$ is sequentially lower semicontinuous with respect to strong convergence with moments.

To prove the converse implication, by Theorem 7.4 it is sufficient to prove the inequality $|\nabla^-\mathrm{Ent}_{\mathfrak m}|^2(\mu) \le 4\int_X |\nabla\sqrt\rho|_*^2\,d\mathfrak m$. Assume first that $\rho \le M$ $\mathfrak m$-a.e. in $X$ for some $M \in [0,\infty)$. Taking Theorem 7.5 into account, it suffices to find a sequence of functions $\rho_m = \max\{f_m, c_m e^{-2V^2}\}$ convergent to $\rho$ in $L^1(X,\mathfrak m)$ and satisfying:

(a) $f_m$ is $\mathsf d$-Lipschitz, nonnegative, bounded from above by $M$ and null for $V$ sufficiently large;

(b) $\limsup_{m\to\infty} \frac12\int_X |\nabla\sqrt{\rho_m}|^2\,d\mathfrak m \le \mathsf{Ch}_*(\sqrt\rho)$;

(c) $\int_X V^2\rho_m\,d\mathfrak m \to \int_X V^2\rho\,d\mathfrak m$.

Since the $\mathsf d$-Lipschitz property of the weight implies $e^{-V^2} \in W^{1,2}(X,\mathsf d,\mathfrak m)$ and $\int_X V^2 e^{-2V^2}\,d\mathfrak m < \infty$, if we choose $c_m > 0$ infinitesimal it suffices to find $f_m$ satisfying (a), $\frac12\int_X |\nabla\sqrt{f_m}|^2\,d\mathfrak m \to \mathsf{Ch}_*(\sqrt\rho)$ and $\int_X V^2 f_m\,d\mathfrak m \to \int_X V^2\rho\,d\mathfrak m$.

To this aim, given $m > 0$, we fix a compact set $K \subset X$ such that $\int_{X\setminus K} \rho\,d\mathfrak m < (1+m)^{-2}$ and a 1-Lipschitz function $\phi : X \to [0,1]$ equal to 1 on $K$ and equal to 0 out of the 1-neighbourhood of $K$, denoted by $\tilde K$. Notice that $\mathfrak m(\tilde K) < \infty$, since $V$ is bounded from above in $\tilde K$.

By approximation, for all $f \in D(\mathsf{Ch}_*)$ we have

\[ |\nabla(f\phi)|_* \le \phi\,|\nabla f|_* + |f|\,|\nabla\phi|_* \quad \mathfrak m\text{-a.e. in } X. \]

Let now $(g_n)$ be a sequence of $\mathsf d$-Lipschitz functions convergent to $\sqrt\rho$ in $L^2(X,\mathfrak m)$ and satisfying $\frac12\int_X |\nabla g_n|^2\,d\mathfrak m \to \mathsf{Ch}_*(\sqrt\rho)$; by a simple truncation argument we can assume that all $g_n$ satisfy $0 \le g_n \le \sqrt M$. The bounded, nonnegative, $\mathsf d$-Lipschitz functions $g_n\phi$ converge in $L^2(X,\mathfrak m)$ to $\sqrt\rho\,\phi$ and, thanks to the inequality $|\nabla\phi|_* \le \chi_{X\setminus K}$ $\mathfrak m$-a.e. in $X$, satisfy

\[ \limsup_{n\to\infty} \mathsf{Ch}_*(g_n\phi) \le \Big(1 + \frac1m\Big)\mathsf{Ch}_*(\sqrt\rho) + (1+m)\int_{X\setminus K} \rho\,d\mathfrak m \le \Big(1 + \frac1m\Big)\mathsf{Ch}_*(\sqrt\rho) + \frac{1}{1+m}. \]

In addition, $g_n^2\phi^2 \to \rho\phi^2$ in $L^1(X,\mathfrak m)$ because the functions vanish out of $\tilde K$. We conclude that we have also

\[ \lim_{n\to\infty} \int_X V^2 g_n^2\phi^2\,d\mathfrak m = \int_X V^2\rho\,\phi^2\,d\mathfrak m \le \int_X V^2\rho\,d\mathfrak m. \]

By a diagonal argument, choosing $f_m = g_n^2\phi^2$ with $n = n(m)$ sufficiently large, the existence of a sequence with the stated properties is proved.

In the case when $\rho$ is not bounded we truncate $\rho$, without increasing its Cheeger's energy, and use once more the lower semicontinuity of $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$. □



7.3 Convexity of the squared slope 

This subsection adapts and extends some ideas extracted from [16] to the more general frame- 
work considered in this paper. The main result of the section shows that the squared Wasser- 
stein slope of the entropy is always convex (with respect to the linear structure in the space 
of measures), independently from the identification with the Fisher information considered in 
Theorem 7.6 (the identification therein relies on the assumption that $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$ is sequentially
lower semicontinuous with respect to strong convergence with moments). 

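Let us first introduce the notion of push forward of a measure through a transport plan: given $\gamma \in \mathscr P(X\times X)$ with marginals $\gamma^i = \pi^i_\sharp\gamma$ and given $\mu \in \mathscr P(X)$ we set

\[ \gamma_\mu := (\rho\circ\pi^1)\,\gamma \ \text{ with } \ \mu = \rho\gamma^1 + \mu^s, \ \mu^s \perp \gamma^1, \qquad \gamma_\sharp\mu := \pi^2_\sharp\gamma_\mu. \tag{7.23} \]

We recall that this construction first appeared, with a different notation, in Sturm's paper [35]. Notice that $\gamma_\mu$ is a probability measure and $\pi^1_\sharp\gamma_\mu = \mu$ if $\mu \ll \gamma^1$; in this case, if $(\gamma_x)_{x\in X}$ is the disintegration of $\gamma$ with respect to its first marginal $\gamma^1$, we have

\[ \gamma_\sharp\mu(B) = \int_X \gamma_x(B)\,d\mu(x) \quad \text{for every } B \in \mathscr B(X). \tag{7.24} \]

Since moreover $\gamma_\mu \ll \gamma$ we also have that $\gamma_\sharp\mu \ll \gamma^2$.
Notice that

\[ \gamma = \int_X \delta_{r(x)}\,d\nu(x), \quad \mu \ll \nu \quad\Longrightarrow\quad \gamma_\sharp\mu = r_\sharp\mu. \tag{7.25} \]

In the next lemma we consider the real-valued map

\[ \mu \mapsto G_\gamma(\mu) := \mathrm{Ent}_{\mathfrak m}(\mu) - \mathrm{Ent}_{\mathfrak m}(\gamma_\sharp\mu), \tag{7.26} \]

defined in the convex set

\[ R_\gamma := \big\{ \mu \in \mathscr P(X) : \mu \ll \pi^1_\sharp\gamma, \ \ \mu,\ \gamma_\sharp\mu \in D(\mathrm{Ent}_{\mathfrak m}) \big\}. \tag{7.27} \]

In the simple case when $\gamma = \int_X \delta_{r(x)}\,d\mathfrak m(x)$ with $r : X \to X$ Borel bijection we may use first the representation formula (7.25) for $\gamma_\sharp\mu$ and then (7.4) to obtain

\[ \mathrm{Ent}_{\mathfrak m}(\gamma_\sharp\mu) = \mathrm{Ent}_{\mathfrak m}(r_\sharp\mu) = \mathrm{Ent}_{\mathfrak m'}(\mu) \]

with $\mathfrak m' := (r^{-1})_\sharp\mathfrak m$. Since $\mu \in R_\gamma$ we have $\mathrm{Ent}_{\mathfrak m'}(\mu) < \infty$ and we can use (7.8) for the change of reference measure in the relative entropy to get

\[ \mathrm{Ent}_{\mathfrak m}(\mu) - \mathrm{Ent}_{\mathfrak m}(\gamma_\sharp\mu) = \int_X \log\Big(\frac{d\mathfrak m'}{d\mathfrak m}\Big)\,d\mu, \]

so that $G_\gamma$ is linear w.r.t. $\mu$. In general, when $r$ is not injective or $r$ is multivalued, convexity persists: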

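Lemma 7.7 For every $\gamma \in \mathscr P(X\times X)$ the map $G_\gamma$ in (7.26) is convex in $R_\gamma$.

Proof. Let $\mu_1 = \rho_1\mathfrak m$, $\mu_2 = \rho_2\mathfrak m \in R_\gamma$, set $\mu = \alpha_1\mu_1 + \alpha_2\mu_2$ with $\alpha_1 + \alpha_2 = 1$, $\alpha_1, \alpha_2 \in (0,1)$ and denote by $\theta_i \le 1/\alpha_i$ the densities of $\mu_i$ w.r.t. $\mu$.

We apply (7.8) of Lemma 7.2 with $\nu := \mu_i$ and $\mathfrak n := \mu$ to get

\[ \mathrm{Ent}_{\mathfrak m}(\mu_i) = \mathrm{Ent}_\mu(\mu_i) + \int_X \log\rho\,d\mu_i, \]

where $\rho = \alpha_1\rho_1 + \alpha_2\rho_2$ is the density of $\mu$ w.r.t. $\mathfrak m$. Taking a convex combination of the previous equalities for $i = 1, 2$, we obtain

\[ \alpha_1\mathrm{Ent}_{\mathfrak m}(\mu_1) + \alpha_2\mathrm{Ent}_{\mathfrak m}(\mu_2) = \alpha_1\mathrm{Ent}_\mu(\mu_1) + \alpha_2\mathrm{Ent}_\mu(\mu_2) + \mathrm{Ent}_{\mathfrak m}(\mu). \tag{7.28} \]

Analogously, setting $\nu_i := \gamma_\sharp\mu_i$ and $\nu := \gamma_\sharp\mu = \alpha_1\nu_1 + \alpha_2\nu_2$, we have

\[ \alpha_1\mathrm{Ent}_{\mathfrak m}(\nu_1) + \alpha_2\mathrm{Ent}_{\mathfrak m}(\nu_2) = \alpha_1\mathrm{Ent}_\nu(\nu_1) + \alpha_2\mathrm{Ent}_\nu(\nu_2) + \mathrm{Ent}_{\mathfrak m}(\nu). \tag{7.29} \]

Combining (7.28) and (7.29) we obtain

\[ \alpha_1 G_\gamma(\mu_1) + \alpha_2 G_\gamma(\mu_2) = G_\gamma(\mu) + \sum_{i=1,2} \alpha_i\big(\mathrm{Ent}_\mu(\mu_i) - \mathrm{Ent}_\nu(\nu_i)\big). \tag{7.30} \]

Since $\nu_i = \pi^2_\sharp(\gamma_{\mu_i})$ and $\nu = \pi^2_\sharp(\gamma_\mu)$, (7.4) yields

\[ \mathrm{Ent}_\nu(\nu_i) \le \mathrm{Ent}_{\gamma_\mu}(\gamma_{\mu_i}) = \mathrm{Ent}_{\gamma_\mu}\big((\theta_i\circ\pi^1)\gamma_\mu\big) = \mathrm{Ent}_\mu(\mu_i), \]

where in the last equality we used that the first marginal of $\gamma_\mu$ is $\mu$. Therefore (7.30) yields $\alpha_1 G_\gamma(\mu_1) + \alpha_2 G_\gamma(\mu_2) \ge G_\gamma(\mu)$. □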
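On a finite set a plan is a nonnegative matrix and (7.24) becomes the action of a Markov kernel, so the convexity of $G_\gamma$ can be probed numerically. The Python sketch below is our own illustration, not part of the proof; the function names (`push_through_plan`, `G`) and the randomly generated data are hypothetical, and all weights are kept strictly positive so that every entropy involved is finite.

```python
import math
import random

def relative_entropy(mu, m):
    """Sum of mu(x) * log(mu(x) / m(x)); all entries are assumed positive here."""
    return sum(p * math.log(p / w) for p, w in zip(mu, m))

def push_through_plan(gamma, mu):
    """Discrete version of (7.24): (gamma_sharp mu)(y) = sum_x gamma_x(y) mu(x),
    where gamma_x is row x of gamma divided by the first marginal at x."""
    n_cols = len(gamma[0])
    image = [0.0] * n_cols
    for x, row in enumerate(gamma):
        row_mass = sum(row)  # first marginal of gamma at x (positive by construction)
        for y in range(n_cols):
            image[y] += mu[x] * row[y] / row_mass
    return image

def G(gamma, mu, m):
    """Discrete version of the map (7.26)."""
    return relative_entropy(mu, m) - relative_entropy(push_through_plan(gamma, mu), m)

def random_weights(rng, size):
    return [rng.random() + 0.05 for _ in range(size)]

def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(0)
size = 5
m = random_weights(rng, size)                        # reference measure
flat = normalise(random_weights(rng, size * size))   # a plan on {0..4} x {0..4}
gamma = [flat[i * size:(i + 1) * size] for i in range(size)]

# Smallest observed value of a1*G(mu1) + a2*G(mu2) - G(a1*mu1 + a2*mu2).
smallest_gap = math.inf
for _ in range(200):
    mu1 = normalise(random_weights(rng, size))
    mu2 = normalise(random_weights(rng, size))
    a = rng.uniform(0.05, 0.95)
    mix = [a * p + (1.0 - a) * q for p, q in zip(mu1, mu2)]
    gap = a * G(gamma, mu1, m) + (1.0 - a) * G(gamma, mu2, m) - G(gamma, mix, m)
    smallest_gap = min(smallest_gap, gap)
```

By (7.30) each `gap` equals $\sum_i \alpha_i(\mathrm{Ent}_\mu(\mu_i) - \mathrm{Ent}_\nu(\nu_i))$, so `smallest_gap` should be nonnegative up to rounding.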

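Theorem 7.8 The squared descending slope $|\nabla^-\mathrm{Ent}_{\mathfrak m}|^2$ of the relative entropy is convex.

Proof. Let $\mu_1, \mu_2 \in D(\mathrm{Ent}_{\mathfrak m})$ be measures with finite descending slope and let $\mu = \alpha_1\mu_1 + \alpha_2\mu_2$ with $\alpha_1, \alpha_2 \in (0,1)$, $\alpha_1 + \alpha_2 = 1$. Obviously $\mu \in D(\mathrm{Ent}_{\mathfrak m})$ and since it is not restrictive to assume $|\nabla^-\mathrm{Ent}_{\mathfrak m}|(\mu) > 0$, by definition of descending slope we can find a sequence $(\nu^n) \subset D(\mathrm{Ent}_{\mathfrak m})$ with $\mathrm{Ent}_{\mathfrak m}(\nu^n) \le \mathrm{Ent}_{\mathfrak m}(\mu)$ such that

\[ W_2(\nu^n,\mu) \to 0, \qquad \frac{\mathrm{Ent}_{\mathfrak m}(\mu) - \mathrm{Ent}_{\mathfrak m}(\nu^n)}{W_2(\mu,\nu^n)} \to |\nabla^-\mathrm{Ent}_{\mathfrak m}|(\mu). \]

Let $\gamma^n$ be optimal plans with marginals $\mu$ and $\nu^n$ respectively and let $\nu^n_i := \gamma^n_\sharp\mu_i$. Since $\gamma^n_{\mu_i}$ are optimal plans with marginals $\mu_i$ and $\nu^n_i$, we have

\[ W_2^2(\mu_i,\nu^n_i) = \int_{X\times X} \theta_i(x)\,\mathsf d^2(x,y)\,d\gamma^n(x,y) \to 0 \quad \text{as } n \to \infty, \]

where $\theta_i \le 1/\alpha_i$ are the densities of $\mu_i$ w.r.t. $\mu$; in particular

\[ W_2^2(\mu,\nu^n) = \alpha_1 W_2^2(\mu_1,\nu^n_1) + \alpha_2 W_2^2(\mu_2,\nu^n_2). \tag{7.31} \]

Since $|\nabla^-\mathrm{Ent}_{\mathfrak m}|(\mu_i) < \infty$, by the very definition of descending slope for every $S_i > |\nabla^-\mathrm{Ent}_{\mathfrak m}|(\mu_i)$ there exists $\bar n \in \mathbb N$ satisfying

\[ \mathrm{Ent}_{\mathfrak m}(\mu_i) - \mathrm{Ent}_{\mathfrak m}(\nu^n_i) \le S_i\,W_2(\mu_i,\nu^n_i) \quad \text{for every } n \ge \bar n. \]

By Lemma 7.7 and (7.31) we get, for $n \ge \bar n$

\[ \mathrm{Ent}_{\mathfrak m}(\mu) - \mathrm{Ent}_{\mathfrak m}(\nu^n) \le \alpha_1 S_1 W_2(\mu_1,\nu^n_1) + \alpha_2 S_2 W_2(\mu_2,\nu^n_2) \le \big(\alpha_1 S_1^2 + \alpha_2 S_2^2\big)^{1/2}\,W_2(\mu,\nu^n), \]

so that, passing to the limit as $n \to \infty$, our choice of $(\nu^n)$ yields that $|\nabla^-\mathrm{Ent}_{\mathfrak m}|(\mu)$ does not exceed $\big(\alpha_1 S_1^2 + \alpha_2 S_2^2\big)^{1/2}$. Taking the infimum with respect to $S_i$ we conclude. □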

8 The Wasserstein gradient flow of the entropy and its identification with the gradient flow of Cheeger's energy

8.1 Gradient flow of Entm: the case of bounded densities. 

In the next result we show that any Wasserstein gradient flow (recall Definition 2.11) of the entropy functional with uniformly bounded densities coincides with the $L^2$-gradient flow of the Cheeger's functional. We prove in fact a slightly stronger result, starting from the energy dissipation inequality (2.25) instead of the identity (2.26), where we use the Fisher information functional $\mathsf F$ defined by Definition 4.9 instead of the squared slope of $\mathrm{Ent}_{\mathfrak m}$. Recall that $\mathsf F(f) \le |\nabla^-\mathrm{Ent}_{\mathfrak m}|^2(f\mathfrak m)$ by Theorem 7.4.

Theorem 8.1 Let $(X,\tau,\mathsf d,\mathfrak m)$ be an extended Polish space satisfying (4.2) and let $\mu_t = f_t\mathfrak m \in D(\mathrm{Ent}_{\mathfrak m})$, $t \in [0,T]$, be a curve in $AC^2((0,T);(\mathscr P(X),W_2))$ satisfying the Entropy-Fisher dissipation inequality

\[ \mathrm{Ent}_{\mathfrak m}(\mu_0) \ge \mathrm{Ent}_{\mathfrak m}(\mu_T) + \frac12\int_0^T |\dot\mu_t|^2\,dt + \frac12\int_0^T \mathsf F(f_t)\,dt. \tag{8.1} \]

If $\sup_{t\in[0,T]} \|f_t\|_{L^\infty(X,\mathfrak m)} < \infty$ then $f_t$ coincides in $[0,T]$ with the gradient flow $\mathsf H_t(f_0)$ of Cheeger's energy starting from $f_0$.

In particular, for all $f_0 \in L^\infty(X,\mathfrak m)$ there exists at most one Wasserstein gradient flow $\mu_t = f_t\mathfrak m$ of $\mathrm{Ent}_{\mathfrak m}$ in $(\mathscr P_{[\mu_0]}(X), W_2)$ starting from $\mu_0 = f_0\mathfrak m$ with uniformly bounded densities $f_t$.

Proof. Let us set $\mu^1_t = \mu_t$, $f^1_t = f_t$ and let us first observe that by Lemma 5.15 and (6.8) the curve $\mu^1_t$ satisfies

\[ \mathrm{Ent}_{\mathfrak m}(\mu^1_0) = \mathrm{Ent}_{\mathfrak m}(\mu^1_t) + \frac12\int_0^t |\dot\mu^1_s|^2\,ds + \frac12\int_0^t \mathsf F(f^1_s)\,ds \quad \text{for every } t \in [0,T]. \tag{8.2} \]

Indeed, (5.13) and (6.8) show that the function defined by the right-hand side of (8.2) is nondecreasing with respect to $t$ and coincides with $\mathrm{Ent}_{\mathfrak m}(\mu^1_0)$ at $t = 0$ and at $t = T$ by (8.1).

Let $\mu^2_t = f^2_t\mathfrak m$, with $f^2_t := \mathsf H_t(f_0)$, be the solution of the $L^2$-gradient flow of the Cheeger's energy. Theorem 4.16 shows that $\|f^2_t\|_{L^\infty(X,\mathfrak m)} \le \|f_0\|_{L^\infty(X,\mathfrak m)}$; by Lemma 6.1 and Proposition 4.22 we get

\[ \mathrm{Ent}_{\mathfrak m}(\mu^2_0) \ge \mathrm{Ent}_{\mathfrak m}(\mu^2_t) + \frac12\int_0^t |\dot\mu^2_s|^2\,ds + \frac12\int_0^t \mathsf F(f^2_s)\,ds \quad \text{for every } t \in [0,T]. \tag{8.3} \]

We recall that the squared Wasserstein distance is convex w.r.t. linear interpolation of measures. Therefore, given two absolutely continuous curves $(\mu^1_t)$ and $(\mu^2_t)$, the curve $t \mapsto \mu_t := (\mu^1_t + \mu^2_t)/2$ is absolutely continuous as well and its metric speed can be bounded by

\[ |\dot\mu_t|^2 \le \frac{|\dot\mu^1_t|^2 + |\dot\mu^2_t|^2}{2} \quad \text{for a.e. } t \in (0,T). \tag{8.4} \]

Adding up (8.2) and (8.3) and using the convexity of the Fisher information functional (see Lemma 4.10), the convexity of the squared metric speed guaranteed by (8.4) and taking into account the strict convexity of $\mathrm{Ent}_{\mathfrak m}$ we deduce that for the curve $t \mapsto \mu_t := (\mu^1_t + \mu^2_t)/2$ it holds

\[ \mathrm{Ent}_{\mathfrak m}(\mu_0) > \mathrm{Ent}_{\mathfrak m}(\mu_t) + \frac12\int_0^t |\dot\mu_s|^2\,ds + \frac12\int_0^t \mathsf F(f_s)\,ds \]

for every $t$ such that $\mu^1_t \ne \mu^2_t$, where $f_t := \frac12(f^1_t + f^2_t)$ is the density of $\mu_t$. This contradicts Lemma 5.15, which yields the opposite inequality. □
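The convexity of $W_2^2$ under linear interpolation of measures, which underlies (8.4), can be checked exactly for finitely supported measures on the real line, where the monotone (quantile) coupling is optimal for the quadratic cost. The Python sketch below is our own illustration; the function names `w2_squared` and `midpoint` and the example measures are hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def w2_squared(mu, nu):
    """Squared W_2 distance between two finitely supported probability measures
    on the real line, given as dicts {position: weight} with Fraction weights,
    computed through the monotone rearrangement of mass."""
    xs = sorted(mu.items())
    ys = sorted(nu.items())
    i = j = 0
    rem_x, rem_y = xs[0][1], ys[0][1]
    cost = Fraction(0)
    while True:
        moved = min(rem_x, rem_y)
        cost += moved * (xs[i][0] - ys[j][0]) ** 2
        rem_x -= moved
        rem_y -= moved
        if rem_x == 0:
            i += 1
            if i == len(xs):
                break  # all mass transported (total masses are equal)
            rem_x = xs[i][1]
        if rem_y == 0:
            j += 1
            if j == len(ys):
                break
            rem_y = ys[j][1]
    return cost

def midpoint(mu1, mu2):
    """Linear interpolation (mu1 + mu2) / 2 of two measures."""
    out = {}
    for measure in (mu1, mu2):
        for x, w in measure.items():
            out[x] = out.get(x, Fraction(0)) + Fraction(w) / 2
    return out

one = Fraction(1)
# Two "crossing" transports: mass moves from 0 to 3 and from 3 to 0.
mu1, nu1 = {0: one}, {3: one}
mu2, nu2 = {3: one}, {0: one}
lhs_crossing = w2_squared(midpoint(mu1, mu2), midpoint(nu1, nu2))
rhs_crossing = (w2_squared(mu1, nu1) + w2_squared(mu2, nu2)) / 2

# A less degenerate example with several atoms.
mu3, nu3 = {0: one / 2, 2: one / 2}, {1: one / 3, 5: 2 * one / 3}
mu4, nu4 = {-1: one / 4, 4: 3 * one / 4}, {0: one}
lhs_generic = w2_squared(midpoint(mu3, mu4), midpoint(nu3, nu4))
rhs_generic = (w2_squared(mu3, nu3) + w2_squared(mu4, nu4)) / 2
```

In the crossing example the two midpoints coincide, so the left-hand side vanishes while each individual squared distance equals 9: the inequality can be strict.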

Although the result will not play a role in the paper, let us see that we can apply the previous theorem to characterize all limits of the JKO [22] - Minimizing Movement Scheme (see [4, Definition 2.0.6]) generated by the entropy functional in $\mathscr P_V(X)$. The result shows that starting from an initial datum with bounded density, the JKO scheme always converges to the $L^2$-gradient flow of Cheeger's energy, without any extra assumption on the space, except for the integrability condition (4.2).

For a given initial datum $\mu_0 = f_0\mathfrak m \in D(\mathrm{Ent}_{\mathfrak m})$ and a time step $h > 0$ we consider the sequence $\mu^h_n = f^h_n\mathfrak m$ defined by the recursive variational problem

\[ \mu^h_n \in \operatorname{argmin}_{\mu} \Big\{ \frac{1}{2h}W_2^2(\mu,\mu^h_{n-1}) + \mathrm{Ent}_{\mathfrak m}(\mu) \Big\}, \]

and we set $\mu^h(t) = f^h(t)\mathfrak m := \mu^h_n$ if $t \in ((n-1)h, nh]$.
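To make the recursive scheme concrete, here is a small Python sketch, our own illustration under simplifying assumptions that are not made in the text: $X = \mathbb R$ with the Lebesgue measure as reference, and the minimization restricted to centred Gaussians $N(0,\sigma^2)$. On this family the entropy is $-\log\sigma - \frac12\log(2\pi e)$ and $W_2^2$ between two centred Gaussians is $(\sigma - \sigma')^2$, so each step is a one-dimensional problem with an explicit minimizer. The names `jko_step`, `entropy_gaussian` and `run_jko` are hypothetical.

```python
import math

def jko_step(sigma_prev, h):
    """One step of the scheme restricted to centred Gaussians:
    minimise (sigma - sigma_prev)^2 / (2h) - log(sigma) over sigma > 0.
    The optimality condition sigma - sigma_prev = h / sigma is a quadratic
    equation whose positive root is returned."""
    return 0.5 * (sigma_prev + math.sqrt(sigma_prev ** 2 + 4.0 * h))

def entropy_gaussian(sigma):
    """Relative entropy of N(0, sigma^2) w.r.t. the Lebesgue measure."""
    return -math.log(sigma) - 0.5 * math.log(2.0 * math.pi * math.e)

def run_jko(sigma0, h, n_steps):
    """Iterate the scheme and return the list of standard deviations."""
    sigmas = [sigma0]
    for _ in range(n_steps):
        sigmas.append(jko_step(sigmas[-1], h))
    return sigmas

sigma0, T, h = 1.0, 1.0, 1e-3
sigmas = run_jko(sigma0, h, round(T / h))

# The heat equation d/dt f = f'' started from N(0, sigma0^2) has variance sigma0^2 + 2t.
variance_jko = sigmas[-1] ** 2
variance_heat = sigma0 ** 2 + 2.0 * T
```

In this toy setting `variance_jko` approaches `variance_heat` as `h` decreases and the entropy is nonincreasing along the discrete trajectory, which is the behaviour described by the convergence result below.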

Corollary 8.2 (Convergence of the minimizing movement scheme) Let $(X,\tau,\mathsf d,\mathfrak m)$ be an extended Polish space satisfying (4.2) and let $\mu_0 = f_0\mathfrak m \in D(\mathrm{Ent}_{\mathfrak m})$ with $f_0 \in L^\infty(X,\mathfrak m)$. Then for every $t \ge 0$ the family $\mu^h(t)$ weakly converges to $\mu_t = f_t\mathfrak m$ as $h \downarrow 0$, where $f_t = \mathsf H_t(f_0)$ is the $L^2$-gradient flow of Cheeger's energy.

Proof. Arguing exactly as in [1, §2.1], [31, Proposition 2] it is not hard to show that $\|f^h_n\|_\infty \le \|f_0\|_\infty$.

We want to apply the theory developed in [4, Chap. 2-3]: according to the notation therein, $\mathscr S$ is the metric space $\mathscr P_{[\mu_0]}(X)$ endowed with the Wasserstein distance $W_2$, $\sigma$ is the weak topology in $\mathscr P(X)$, and $\phi$ is the Entropy functional $\mathrm{Ent}_{\mathfrak m}$. Since by (7.2) and (7.5) the negative part of $\mathrm{Ent}_{\mathfrak m}$ has at most quadratic growth in $\mathscr P_{[\mu_0]}(X)$, the basic assumptions [4, 2.1(a,b,c)] are satisfied and we can apply the compactness result [4, Corollary 3.3.4]: from any vanishing sequence of time steps $h_m \downarrow 0$ we can extract a subsequence (still denoted by $h_m$) such that $\mu^{h_m}(t) \to \mu_t = f_t\mathfrak m$ weakly in $\mathscr P(X)$, with $f^{h_m}(t) \to f_t$ weakly in any $L^p(X,\mathfrak m)$, $p \in [1,\infty)$, and $\|f_t\|_\infty \le \|f_0\|_\infty$. Since the relaxed slope of the entropy functional, defined as

\[ |\partial^-\mathrm{Ent}_{\mathfrak m}|(\mu) := \inf\Big\{ \liminf_{n\to\infty} |\nabla^-\mathrm{Ent}_{\mathfrak m}|(\mu_n) : \mu_n \rightharpoonup \mu, \ \sup_n W_2(\mu_n,\mu),\ \mathrm{Ent}_{\mathfrak m}(\mu_n) < \infty \Big\} \]

still satisfies the lower bound (7.11) $|\partial^-\mathrm{Ent}_{\mathfrak m}|^2(\rho\mathfrak m) \ge \mathsf F(\rho)$ thanks to the lower semicontinuity of the Fisher information with respect to the weak $L^1(X,\mathfrak m)$-topology, the energy inequality [4, (3.4.1)] based on De Giorgi's variational interpolation yields

\[ \mathrm{Ent}_{\mathfrak m}(\mu_0) \ge \mathrm{Ent}_{\mathfrak m}(\mu_T) + \frac12\int_0^T |\dot\mu_t|^2\,dt + \frac12\int_0^T \mathsf F(f_t)\,dt. \]

Applying the previous theorem we conclude that $f_t = \mathsf H_t(f_0)$. Since the limit is uniquely characterized, all the family $\mu^h(t)$ converges to $\mu_t$ as $h \downarrow 0$. □



8.2 Uniqueness of the Wasserstein gradient flow if $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$ is an upper gradient

In the next theorem we prove uniqueness of the gradient flow of $\mathrm{Ent}_{\mathfrak m}$, a result that will play a key role in the equivalence results of the next section. Here we can avoid the uniform $L^\infty$ bound assumed in Theorem 8.1, but we need to suppose that $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$ is an upper gradient for the entropy functional (a condition which is ensured by its geodesic $K$-convexity, see the next section).

Theorem 8.3 (Uniqueness of the gradient flow of $\mathrm{Ent}_{\mathfrak m}$) Let $(X,\tau,\mathsf d,\mathfrak m)$ be a Polish extended space such that $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$ is an upper gradient of $\mathrm{Ent}_{\mathfrak m}$ and let $\mu \in D(\mathrm{Ent}_{\mathfrak m})$. Then there exists at most one gradient flow of $\mathrm{Ent}_{\mathfrak m}$ starting from $\mu$ in $(\mathscr P_{[\mu]}(X), W_2)$.

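Proof. As in [16] and in the proof of Theorem 8.1, assume that starting from some $\mu \in D(\mathrm{Ent}_{\mathfrak m})$ we can find two different gradient flows $(\mu^1_t)$ and $(\mu^2_t)$. Then we have

\[
\begin{aligned}
\mathrm{Ent}_{\mathfrak m}(\mu) &= \mathrm{Ent}_{\mathfrak m}(\mu^1_T) + \frac12\int_0^T |\dot\mu^1_t|^2\,dt + \frac12\int_0^T |\nabla^-\mathrm{Ent}_{\mathfrak m}|^2(\mu^1_t)\,dt \qquad \forall T \ge 0, \\
\mathrm{Ent}_{\mathfrak m}(\mu) &= \mathrm{Ent}_{\mathfrak m}(\mu^2_T) + \frac12\int_0^T |\dot\mu^2_t|^2\,dt + \frac12\int_0^T |\nabla^-\mathrm{Ent}_{\mathfrak m}|^2(\mu^2_t)\,dt \qquad \forall T \ge 0.
\end{aligned}
\]

Adding up these two equalities and using the convexity of the squared slope guaranteed by Theorem 7.8, the convexity of the squared metric speed guaranteed by (8.4) and taking into account the strict convexity of $\mathrm{Ent}_{\mathfrak m}$ we deduce that for the curve $t \mapsto \mu_t := (\mu^1_t + \mu^2_t)/2$ it holds

\[ \mathrm{Ent}_{\mathfrak m}(\mu) > \mathrm{Ent}_{\mathfrak m}(\mu_T) + \frac12\int_0^T |\dot\mu_t|^2\,dt + \frac12\int_0^T |\nabla^-\mathrm{Ent}_{\mathfrak m}|^2(\mu_t)\,dt, \]

for every $T$ such that $\mu^1_T \ne \mu^2_T$. Taking the upper gradient property into account, this contradicts (2.24). □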

Remark 8.4 The proofs of Theorem 8.3 and Theorem 8.1 do not rely on contractivity of the Wasserstein distance. Actually, as proved by Ohta and Sturm in [30], the property

\[ W_2(\mu_t,\nu_t) \le e^{Ct}\,W_2(\mu_0,\nu_0) \]

for gradient flows of $\mathrm{Ent}_{\mathfrak m}$ in Minkowski spaces $(\mathbb R^n, \|\cdot\|, \mathscr L^n)$ whose norm is not induced by an inner product fails for any $C \in \mathbb R$. ■

8.3 Identification of the two gradient flows 

Here we prove one of the main results of this paper, namely the identification of the gradient flow of $\mathsf{Ch}_*$ in $L^2(X,\mathfrak m)$ and the gradient flow of $\mathrm{Ent}_{\mathfrak m}$ in $(\mathscr P(X), W_2)$. The strategy consists in considering a gradient flow $(f_t)$ of $\mathsf{Ch}_*$ with nonnegative initial data and in proving that the curve $t \mapsto \mu_t := f_t\mathfrak m$ is a gradient flow of $\mathrm{Ent}_{\mathfrak m}$ in $(\mathscr P(X), W_2)$. All these results will be applied to the case of metric spaces satisfying a $CD(K,\infty)$ condition in the next section.

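Theorem 8.5 (Identification of the two gradient flows) Let $(X,\tau,\mathsf d,\mathfrak m)$ be a Polish extended space such that (4.2) holds and let us assume that $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$ is lower semicontinuous with respect to strong convergence with moments in $\mathscr P(X)$ on sublevels of $\mathrm{Ent}_{\mathfrak m}$. For all $f_0 \in L^2(X,\mathfrak m)$ such that $\mu_0 = f_0\mathfrak m \in \mathscr P_V(X)$ the following equivalence holds:

(i) If $f_t$ is the gradient flow of $\mathsf{Ch}_*$ in $L^2(X,\mathfrak m)$ starting from $f_0$, then $\mu_t := f_t\mathfrak m$ is the gradient flow of $\mathrm{Ent}_{\mathfrak m}$ in $(\mathscr P_{[\mu_0]}(X), W_2)$ starting from $\mu_0$, $t \mapsto \mathrm{Ent}_{\mathfrak m}(\mu_t)$ is locally absolutely continuous in $(0,\infty)$ and

\[ -\frac{d}{dt}\mathrm{Ent}_{\mathfrak m}(\mu_t) = |\dot\mu_t|^2 = |\nabla^-\mathrm{Ent}_{\mathfrak m}|^2(\mu_t) \quad \text{for a.e. } t \in (0,\infty). \tag{8.5} \]

(ii) Conversely, if $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$ is an upper gradient of $\mathrm{Ent}_{\mathfrak m}$, and $\mu_t$ is the gradient flow of $\mathrm{Ent}_{\mathfrak m}$ in $(\mathscr P_{[\mu_0]}(X), W_2)$ starting from $f_0\mathfrak m$, then $\mu_t = f_t\mathfrak m$ and $f_t$ is the gradient flow of $\mathsf{Ch}_*$ in $L^2(X,\mathfrak m)$ starting from $f_0$.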

Proof. (i) First of all, we remark that assumption (6.2) of Lemma 6.1 is satisfied, thanks to Theorem 4.20; in addition, the same theorem ensures that $\int_X V^2 f_t\,d\mathfrak m < \infty$ for all $t \ge 0$. Defining $\mu_t := f_t\mathfrak m$, we know by Proposition 4.22 that the map $t \mapsto \mathrm{Ent}_{\mathfrak m}(\mu_t)$ is locally absolutely continuous in $(0,\infty)$ and that (4.54) holds.

On the other hand, since we assumed the lower semicontinuity of $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$, we can prove that $\mathrm{Ent}_{\mathfrak m}(\mu_t)$ satisfies the energy dissipation inequality (2.25). Indeed, by Lemma 6.1 and Theorem 7.6 it holds:

\[ -\frac{d}{dt}\mathrm{Ent}_{\mathfrak m}(\mu_t) = \int_{\{f_t>0\}} \frac{|\nabla f_t|_*^2}{f_t}\,d\mathfrak m \ge \frac12|\dot\mu_t|^2 + \frac12|\nabla^-\mathrm{Ent}_{\mathfrak m}|^2(\mu_t) \quad \text{for a.e. } t \in (0,\infty). \]

This proves that $\mathrm{Ent}_{\mathfrak m}(\mu_t)$ satisfies the energy dissipation inequality. But, since we know that $t \mapsto \mathrm{Ent}_{\mathfrak m}(\mu_t)$ is locally absolutely continuous we can apply Remark 2.6 to obtain that $\big|\frac{d}{dt}\mathrm{Ent}_{\mathfrak m}(\mu_t)\big| \le |\nabla^-\mathrm{Ent}_{\mathfrak m}|(\mu_t)\,|\dot\mu_t|$ for a.e. $t \in (0,\infty)$. Hence, as explained in Section 2.5, (2.25) in combination with Young inequality and the previous inequality yield that all the inequalities turn a.e. into equalities, so that (8.5) holds.

(ii) We know that a gradient flow $\tilde f_t$ of $\mathsf{Ch}_*$ starting from $f_0$ exists, and part (i) gives that $\tilde\mu_t := \tilde f_t\mathfrak m$ is a gradient flow of $\mathrm{Ent}_{\mathfrak m}$. By Theorem 8.3, there is at most one gradient flow starting from $\mu_0$, hence $\mu_t = \tilde\mu_t$ for all $t \ge 0$. □

As a consequence of the identification result, we present a general existence result of the Wasserstein gradient flow of $\mathrm{Ent}_{\mathfrak m}$ which includes also the case of $\sigma$-finite measures.

Theorem 8.6 (Existence of the gradient flow of $\mathrm{Ent}_{\mathfrak m}$) Let $(X,\tau,\mathsf d,\mathfrak m)$ be a Polish extended space satisfying assumption (4.2) and such that $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$ is lower semicontinuous with respect to strong convergence with moments in $\mathscr P(X)$ on sublevels of $\mathrm{Ent}_{\mathfrak m}$. Then for all $\mu = \rho\mathfrak m \in D(\mathrm{Ent}_{\mathfrak m})$ there exists a gradient flow of $\mathrm{Ent}_{\mathfrak m}$ starting from $\mu$ in $(\mathscr P_{[\mu]}(X), W_2)$.

Proof. For completeness we provide a proof that does not use the identification of gradient flows in the case when $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$ is also an upper gradient for the entropy functional: indeed, we can apply the existence result [4, Prop. 2.2.3, Thm. 2.3.3], achieved via the so-called minimizing movements technique, with the topology of weak convergence in duality with $C_b(X)$. Remark 7.3, (7.2), and the lower semicontinuity part of Theorem 7.6 give that the assumptions are satisfied, and we get measures $\mu_t$ satisfying

\[ \mathrm{Ent}_{\mathfrak m}(\rho\mathfrak m) = \mathrm{Ent}_{\mathfrak m}(\mu_t) + \int_0^t \Big(\frac12|\dot\mu_s|^2 + \frac12|\nabla^-\mathrm{Ent}_{\mathfrak m}|^2(\mu_s)\Big)\,ds \qquad \forall t \ge 0. \tag{8.6} \]

In the general case, we can take advantage of the identification of gradient flows and immediately obtain existence when $\rho \in L^2(X,\mathfrak m)$. If only the integrability conditions $\int_X \rho\log\rho\,d\mathfrak m < \infty$ and $\int_X V^2\rho\,d\mathfrak m < \infty$ are available, we can use the same monotone approximation argument as in the proof of Theorem 7.4: keeping that notation, we set $\mu^n_t := \rho^n_t\mathfrak m$, and we decompose the function $\rho\log\rho$ into the sum $h_-(\rho) + h_+(\rho)$ where $h_-(\rho) = \min(\rho, e^{-1})\log(\min(\rho, e^{-1}))$ and $h_+(\rho) = \max(\rho, e^{-1})\log(\max(\rho, e^{-1})) + e^{-1}$ are decreasing and increasing functions respectively. Applying the monotone convergence theorem to $h_\pm(\rho^n_t)$ we easily get $\mathrm{Ent}_{\mathfrak m}(\mu^n_t) \to \mathrm{Ent}_{\mathfrak m}(\mu_t)$ as $n \to \infty$. By the lower semicontinuity of the Fisher information we can pass to the limit in the integral form of (4.54), written for $\rho^n$, obtaining

\[ \big|\mathrm{Ent}_{\mathfrak m}(\mu_{t_1}) - \mathrm{Ent}_{\mathfrak m}(\mu_{t_0})\big| \le \int_{t_0}^{t_1} \mathsf F(\rho_s)\,ds \quad \text{for every } 0 \le t_0 < t_1. \]

It follows that $t \mapsto \mathrm{Ent}_{\mathfrak m}(\mu_t)$ is absolutely continuous. We can now pass to the limit in (8.6) written for $\mu^n$ by using the lower semicontinuity of $|\nabla^-\mathrm{Ent}_{\mathfrak m}|$ and of the 2-energy to obtain a curve $\mu_t$ satisfying the entropy-dissipation inequality (2.25). Since $t \mapsto \mathrm{Ent}_{\mathfrak m}(\mu_t)$ is absolutely continuous, (2.25) yields (2.26) and we conclude that $\mu_t$ is a gradient flow of $\mathrm{Ent}_{\mathfrak m}$. □

9 Metric measure spaces satisfying C D{K, oo) 

In this section we present the applications of the previous theory in the case when the Polish 
extended space (X, r, d, m) has Ricci curvature bounded from below, according to [25] and [35]. 
Under this condition the Wasserstein slope |V~Entm| turns out to be a lower semicontinuous 
upper gradient of the entropy, so that all the assumptions of Theorems 8.3, 8.5, and 8.6 are 
satisfied. 
Definition 9.1 ($CD(K,\infty)$) We say that $(X,\tau,d,m)$ has Ricci curvature bounded from
below by $K \in \mathbb{R}$ if $\mathrm{Ent}_m$ is $K$-convex along geodesics in
$(\mathscr{P}(X),W_2)$. More precisely, this means that for any
$\mu_0, \mu_1 \in D(\mathrm{Ent}_m) \subset \mathscr{P}(X)$ with $W_2(\mu_0,\mu_1) < \infty$
there exists a constant speed geodesic $\mu_t : [0,1] \to \mathscr{P}(X)$ between $\mu_0$ and
$\mu_1$ satisfying

\[ \mathrm{Ent}_m(\mu_t) \le (1-t)\,\mathrm{Ent}_m(\mu_0) + t\,\mathrm{Ent}_m(\mu_1) - \frac{K}{2}\,t(1-t)\,W_2^2(\mu_0,\mu_1) \qquad \forall t \in [0,1]. \tag{9.1} \]

Notice that unlike the definitions given in [25] and [35], here we are allowing the distance
$d$ to attain the value $+\infty$. Also, even if $d$ were finite, this definition slightly
differs from the standard one, as typically geodesic convexity is required only in the space
$(\mathscr{P}_2(X),W_2)$, while here we are assuming it to hold for any couple of probability
measures with finite entropy and distance. Actually, the two are equivalent, as a simple
approximation argument based on the tightness given by Remark 7.3 shows.
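
For orientation, here is (9.1) evaluated at the midpoint $t = 1/2$. This is only a specialization of the definition and uses no additional assumption:

```latex
\mathrm{Ent}_m(\mu_{1/2}) \;\le\; \tfrac12\,\mathrm{Ent}_m(\mu_0) + \tfrac12\,\mathrm{Ent}_m(\mu_1)
  - \tfrac{K}{8}\,W_2^2(\mu_0,\mu_1),
\qquad\text{since } \tfrac{K}{2}\cdot\tfrac12\cdot\tfrac12 = \tfrac{K}{8}.
```

For $K > 0$ the entropy at the midpoint lies strictly below the average of the endpoint entropies whenever $\mu_0 \neq \mu_1$; for $K < 0$ the inequality tolerates an excess of at most $\tfrac{|K|}{8}W_2^2(\mu_0,\mu_1)$.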

Remark 9.2 (The integrability condition (4.2)) If $(X,\tau,d,m)$ satisfies a $CD(K,\infty)$
condition and $\tau$ is the topology induced by the finite distance $d$, then (4.2) is
equivalent (see [35, Theorem 4.24]) to assuming that for every $x \in X$ there exists $r > 0$
such that $m(B_r(x)) < \infty$. In this case one can choose $V(x) := A\,d(x,x_0)$ for a
suitable constant $A > 0$ and $x_0 \in X$. ■

Theorem 9.3 (Slope, Fisher, and gradient flows) Let $(X,\tau,d,m)$ be a Polish extended
space satisfying $CD(K,\infty)$ and (4.2).

(i) For every $\mu = f m \in \mathscr{P}_V(X)$ the Wasserstein slope
$|\nabla^-\mathrm{Ent}_m|^2(\mu)$ coincides with the Fisher information of $f$, it is lower
semicontinuous under setwise convergence, according to (7.22), and it is an upper gradient
for $\mathrm{Ent}_m$.

(ii) For every $\mu_0 = f_0 m \in D(\mathrm{Ent}_m)$ there exists a unique gradient flow
$\mu_t = f_t m$ of $\mathrm{Ent}_m$ starting from $\mu_0$ in $(\mathscr{P}_{[\mu_0]}(X),W_2)$.

(iii) If moreover $f_0 \in L^2(X,m)$, the gradient flow $f_t = H_t(f_0)$ of $\mathrm{Ch}_*$ in
$L^2(X,m)$ starting from $f_0$ and the gradient flow $\mu_t$ of $\mathrm{Ent}_m$ in
$(\mathscr{P}_{[\mu_0]}(X),W_2)$ starting from $\mu_0$ coincide, i.e. $\mu_t = f_t m$ for
every $t \ge 0$.

Thanks to this theorem, under the $CD(K,\infty)$ assumption we can unambiguously say that
a Heat Flow on $(X,\tau,d,m)$ is either a gradient flow of Cheeger's energy in $L^2(X,m)$ or a
gradient flow of the relative entropy in $(\mathscr{P}(X),W_2)$, at least for square
integrable initial conditions with finite moment.

Concerning the proof of Theorem 9.3, we observe that applying the results of the previous
section it is sufficient to show that the Wasserstein slope $|\nabla^-\mathrm{Ent}_m|$ is
lower semicontinuous w.r.t. strong convergence with moments in $\mathscr{P}(X)$ on the
sublevels of $\mathrm{Ent}_m$. In fact, if this property holds, (7.21) of Theorem 7.6 shows
that $|\nabla^-\mathrm{Ent}_m|^2$ coincides with the Fisher functional and thus satisfies the
lower semicontinuity property (7.22): in particular it is lower semicontinuous w.r.t. weak
convergence with moments in $\mathscr{P}(X)$. Applying Theorem 8.6 we prove the existence of
the Wasserstein gradient flow starting from $\mu_0$; its uniqueness follows from Theorem 8.3,
since the slope is always an upper gradient of $\mathrm{Ent}_m$ under $CD(K,\infty)$.
Applying Theorem 8.5 we can thus obtain the identification of the two gradient flows.

In order to prove the lower semicontinuity of the slope $|\nabla^-\mathrm{Ent}_m|$ w.r.t.
strong convergence with moments in $\mathscr{P}(X)$ (Proposition 9.7) we proceed in various
steps, adapting the arguments of [16].
Definition 9.4 (Plans with bounded deformation) Let $\tilde m := e^{-V^2} m$, where $V$
satisfies (4.2). We say that $\gamma \in \mathscr{P}(X^2)$ has bounded deformation if

\[ d \in L^\infty(X\times X,\gamma) \quad\text{and}\quad c\,\tilde m \le \pi^i_\sharp\gamma \le \frac{1}{c}\,\tilde m, \quad i = 1,2, \quad\text{for some } c > 0. \tag{9.2} \]

Proposition 9.5 (Sequential lower semicontinuity of $G_\gamma$) For any plan $\gamma$ with
bounded deformation the map $\mu \mapsto G_\gamma(\mu) := \mathrm{Ent}_m(\mu) -
\mathrm{Ent}_m(\gamma_\sharp\mu)$ (recall Section 7.3) is sequentially lower semicontinuous
with respect to weak convergence with moments, on sequences with $\mathrm{Ent}_m$ uniformly
bounded from above.

Proof. Let $\mu_n = \eta_n\tilde m \in \mathscr{P}_V(X)$ be weakly convergent with moments to
$\mu = \eta\tilde m$, with $\mathrm{Ent}_m(\mu_n)$ uniformly bounded. If $\rho$ denotes the
density of $\pi^1_\sharp\gamma$ w.r.t. $\tilde m$, we have that
$(\eta_n/\rho)\circ\pi^1\,\gamma$ is an admissible plan between $\mu_n$ and
$\gamma_\sharp\mu_n$, hence $\rho^{-1} \in L^\infty(X,\tilde m)$ and
$d \in L^\infty(X\times X,\gamma)$ ensure that $\gamma_\sharp\mu_n$ belong to
$\mathscr{P}_V(X)$ as well. A similar argument also shows that $\gamma_\sharp\mu_n$ converge
with moments to $\gamma_\sharp\mu$. From (7.5) we obtain that

\[ \mathrm{Ent}_m(\mu_n) - \mathrm{Ent}_m(\gamma_\sharp\mu_n) = \mathrm{Ent}_{\tilde m}(\mu_n) - \mathrm{Ent}_{\tilde m}(\gamma_\sharp\mu_n) - \int V^2\,d\mu_n + \int V^2\,d\gamma_\sharp\mu_n \]

and that $\mathrm{Ent}_{\tilde m}(\mu_n)$ are uniformly bounded. So, we are basically led,
after a normalization, to the case of a probability reference measure $m$. In this case the
proof uses the equiintegrability in $L^1(m)$ of $\eta_n$, ensured by the upper bound on
entropy, see [16, Proposition 11] for details. □



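Lemma 9.6 (Approximation) If $\mu, \nu \in D(\mathrm{Ent}_m)$ satisfy $W_2(\mu,\nu) < \infty$
then there exist plans $\gamma_n$ with bounded deformation satisfying

\[ \int_{X\times X} d^2\,d(\gamma_n)_\mu \to W_2^2(\mu,\nu) \qquad\text{and}\qquad \mathrm{Ent}_m((\gamma_n)_\sharp\mu) \to \mathrm{Ent}_m(\nu). \]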

Proof. Let $\rho$, $\eta$ be respectively the densities of $\mu$ and $\nu$ w.r.t. $\tilde m$,
let $\gamma$ be an optimal plan relative to $\mu$ and $\nu$ and set
$\gamma_n := z_n^{-1}\chi_{E_n}\gamma$, where

\[ E_n := \{(x,y) : \rho(x) + d(x,y) + \eta(y) \le n\}, \qquad z_n := \gamma(E_n) \uparrow 1. \]

By monotone convergence it is immediate to check that $\gamma_n$ satisfy the first convergence
property, because Zn^^^ 'I 1- In connection with the second one, since Zn{"^n)^ t (t)m = 
considering the second marginals 2;„(7„)jj/i := ?7„m we see that r]n'l rj m-a.e. in X. 

It is clear that $d \in L^\infty(X\times X,\gamma_n)$ and that the marginals of $\gamma_n$
have bounded densities with respect to $\tilde m$. Considering convex combinations
$\tilde\gamma_n := (1 - 1/n)\gamma_n + \gamma^0/n$, with
$\gamma^0 := (\mathrm{Id},\mathrm{Id})_\sharp\tilde m$, we obtain that the densities are
bounded also from below, so that $\tilde\gamma_n$ have bounded deformation; in addition, since
$\gamma^0_\sharp\mu = \mu$, we obtain

(7n)fl/^= (l~~)^n^»7n + ^M, 

so that monotone convergence (see the argument in the proof of Theorem 8.6), convexity and
lower semicontinuity of the entropy ensure that still
$\mathrm{Ent}_m((\tilde\gamma_n)_\sharp\mu) \to \mathrm{Ent}_m(\nu)$. □
Proposition 9.7 ($|\nabla^-\mathrm{Ent}_m|$ is an l.s.c. slope in $CD(K,\infty)$ spaces)
Assume that $(X,\tau,d,m)$ is a Polish extended space satisfying $CD(K,\infty)$ and that (4.2)
holds. Then $D(\mathrm{Ent}_m) \ni \mu \mapsto |\nabla^-\mathrm{Ent}_m|^2(\mu)$ is
sequentially lower semicontinuous w.r.t. weak convergence with moments on the sublevels of
$\mathrm{Ent}_m$. In particular (7.21) holds.

Proof. In this proof we denote by $C(\gamma)$ the cost of $\gamma$, i.e.
$C(\gamma) := \int d^2\,d\gamma$. We closely follow [16, Theorem 12 and Corollary 13].

Let $\mu = \rho m$ be in the domain of the entropy. Taking (2.22) and the $K$-geodesic
convexity of $\mathrm{Ent}_m$ into account, we first prove that

\[ |\nabla^-\mathrm{Ent}_m|(\mu) = \sup_\gamma \frac{\big( G_\gamma(\mu) - \frac{K^-}{2}\,C(\gamma_\mu) \big)^+}{\sqrt{C(\gamma_\mu)}}, \tag{9.3} \]

where the supremum runs in the class of plans $\gamma$ with bounded deformation. Indeed,
inequality $\ge$ follows choosing $\nu = \gamma_\sharp\mu$ in (2.22) and using the trivial
inequality

\[ a \in \mathbb{R},\ c \ge b > 0 \quad\Longrightarrow\quad \frac{(a-b)^+}{\sqrt{b}} \ge \frac{(a-c)^+}{\sqrt{c}}, \]

with $a = G_\gamma(\mu)$, $b = W_2^2(\mu,\nu)$ and $c = C(\gamma_\mu)$, together with the fact
that $C(\gamma_\mu) \ge W_2^2(\mu,\nu)$. The other inequality is a consequence of the
approximation Lemma 9.6.

To conclude, it is sufficient to prove that for all $\gamma$ with bounded deformation the map
$\mu \mapsto \big( G_\gamma(\mu) - \frac{K^-}{2}C(\gamma_\mu) \big)^+ / C(\gamma_\mu)^{1/2}$ is
sequentially lower semicontinuous with respect to weak convergence with moments on the
sublevels of the entropy. This follows from Proposition 9.5 and the fact that
$\mu \mapsto C(\gamma_\mu)$ is continuous along these sequences. In turn, the continuity
property along these sequences follows from the representation

\[ C(\gamma_\mu) = \int_{X\times X} d^2(x,y)\,\frac{d\mu}{d\tilde m}(x)\,\Big( \frac{d\pi^1_\sharp\gamma}{d\tilde m}(x) \Big)^{-1}\,d\gamma(x,y). \]

Indeed, both $(d\pi^1_\sharp\gamma/d\tilde m)^{-1}$ and $d$ are essentially bounded, while the
densities $d\mu/d\tilde m$ are equiintegrable, as we saw in the proof of Proposition 9.5. □
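
The "trivial inequality" invoked in the proof can be checked in one line. The sketch below uses only $c \ge b > 0$ and the monotonicity of the positive part $s \mapsto s^+$:

```latex
% a - b >= a - c because c >= b, hence (a-b)^+ >= (a-c)^+ >= 0;
% 1/\sqrt{b} >= 1/\sqrt{c} > 0 because c >= b > 0. Multiplying the two bounds:
\frac{(a-b)^+}{\sqrt{b}} \;\ge\; \frac{(a-c)^+}{\sqrt{b}} \;\ge\; \frac{(a-c)^+}{\sqrt{c}} .
```

In words: replacing the squared Wasserstein distance by the larger cost of a non-optimal plan can only decrease the quotient, so every plan with bounded deformation gives a lower bound for the slope.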



10 A metric Brenier theorem and gradients of Kantorovich potentials

In this section we provide a "metric" version of Brenier's theorem and we identify ascending
slope and minimal weak upper gradient of Kantorovich potentials. These results depend on an
$L^\infty$ upper bound on interpolations, a property that holds in spaces with Riemannian lower
bounds on Ricci curvature, see [3], or in nonbranching $CD(K,\infty)$ metric spaces (because
the nonbranching property is inherited by $(\mathscr{P}(X),W_2)$, see [38, Corollary 7.32],
[2, Proposition 2.16], and all $p$-entropies are convex). If $d$ is bounded, the $L^\infty$
bound can be relaxed to an easier bound on entropy, but modifying the class of test plans, see
Remark 10.7. We assume throughout this section that

$d$ is a finite distance and $m$ satisfies (4.2).

However, we keep the possibility of considering the case when $\tau$ is not induced by $d$.
In this section we denote by $\mathcal{T}$ the class of test plans concentrated on
$AC^2([0,1];(X,d))$ with bounded compression on the sublevels of $V$ and by
$\mathcal{G} \subset \mathcal{T}$ the subclass of test plans concentrated on $\mathrm{Geo}(X)$.
By Remark 5.11 we have the obvious relation

\[ |\nabla f|_{w,\mathcal{G}} \le |\nabla f|_{w,\mathcal{T}}. \tag{10.1} \]

In the next lemma we prove that, for Kantorovich potentials $\varphi$, the map
$t \mapsto \varphi(\gamma_t)$ is not only Sobolev but also absolutely continuous along
$\mathcal{T}$-almost all curves in $AC^2([0,1];(X,d))$. This holds even though in our general
framework no Lipschitz continuity property (not even a local one) of $\varphi$ can be hoped
for; in particular, by Remark 2.6 we obtain that $|\nabla^+\varphi|$ is a $\mathcal{T}$-weak
upper gradient of $\varphi$.

Lemma 10.1 (Slope is a weak upper gradient for Kantorovich potentials) Let
$\mu = \rho m \in \mathscr{P}(X)$, $\nu \in \mathscr{P}(X)$ with $W_2(\mu,\nu) < \infty$ and
let $\varphi : X \to \mathbb{R}\cup\{-\infty\}$ be a Kantorovich potential relative to some
optimal plan $\gamma$ between $\mu$ and $\nu$. If $\rho$ satisfies

\[ \rho \ge c_M > 0 \quad m\text{-a.e. in } \{V \le M\}, \quad\text{for all } M \ge 0, \tag{10.2} \]

then $\varphi$ is absolutely continuous along $\mathcal{T}$-almost every curve of
$AC^2([0,1];(X,d))$ and the slope $|\nabla^+\varphi|$ is a $\mathcal{T}$-weak upper gradient
of $\varphi$.

Proof. Set $f = -\varphi^c$, so that $\varphi = Q_1 f$ (here we adopt the notation of §3) and
the set $\mathcal{D}(f)$ in (3.3) coincides with $X$. By Proposition 3.9 we know that the
function

\[ D^*(x) := \int_X d(x,y)\,d\gamma_x(y) \]

(where $\{\gamma_x\}_{x\in X}$ is the disintegration of $\gamma$ w.r.t. $\mu$) belongs to
$L^2(X,\mu)$ and bounds $D^-(x,1)$ from above $\mu$-a.e. by (3.23), and then $m$-a.e.; we know
also from (3.13b) that $|\nabla^+\varphi| \in L^2(X,\mu)$ and that
$D^-(x,1) \ge |\nabla^+\varphi|(x)$ wherever $\varphi(x) > -\infty$. We modify $D^*$ in an
$m$-negligible set, getting a function $D \in L^2(X,\mu)$ larger than $D^-(x,1)$ everywhere
and equal to $+\infty$ on the $m$-negligible set $\{\varphi = -\infty\}$.

We claim now that the condition $\int_\gamma D < \infty$ is fulfilled for
$\mathcal{T}$-almost every $\gamma$ in $AC^2([0,1];(X,d))$. Indeed, arguing as in (5.11), for
any test plan $\pi \in \mathcal{T}$ with $\mathcal{E}_2[\gamma] \le N < \infty$ $\pi$-a.e. we
have

\[ \int\!\!\int_{\gamma\cap\{V\le M\}} D\,d\pi \le N \Big( C(\pi,M) \int_{\{V\le M\}} D^2\,dm \Big)^{1/2} \le N \Big( c_M^{-1}\,C(\pi,M) \int_X D^2\,d\mu \Big)^{1/2} < \infty, \]

thanks to the fact that $\rho \ge c_M$ $m$-a.e. on $\{V \le M\}$. Since $M$ is arbitrary and
since $\pi$-a.e. curve $\gamma$ is contained in $\{V \le M\}$ for sufficiently large $M$, the
claim follows.

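Now, let $\gamma \in AC^2([0,1];(X,d))$ with $\int_\gamma D < \infty$, and
$\lambda = |\dot\gamma|\,\mathscr{L}^1$. The potential $\varphi$ is $d$-upper semicontinuous
and $\varphi\circ\gamma$ is finite $\lambda$-a.e. (since $D\circ\gamma$ is finite
$\lambda$-a.e.). By (3.14) with $x = \gamma_s$ and $y = \gamma_t$, taking also the inequality
$D^-(x,1) \le D(x)$ into account, we get

\[ \varphi(\gamma_s) - \varphi(\gamma_t) \le d(\gamma_s,\gamma_t) \Big( D(\gamma_t) + \frac{d(\gamma_s,\gamma_t)}{2} \Big) \le \int_s^t |\dot\gamma_r|\,dr\,\big( D(\gamma_t) + \mathrm{diam}(\gamma) \big) \]

for all $t$ such that $\varphi(\gamma_t) > -\infty$. Hence we can apply Corollary 2.8 to
conclude that $\varphi\circ\gamma$ is absolutely continuous in $[0,1]$. Recalling Remark 2.6
we get

\[ \Big| \int_{\partial\gamma} \varphi \Big| \le \int_\gamma |\nabla^+\varphi|. \]

□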
Using (10.1), the previous lemma and Proposition 3.9, we have the chain of inequalities

\[ |\nabla\varphi|_{w,\mathcal{G}}(x) \le |\nabla\varphi|_{w,\mathcal{T}}(x) \le |\nabla^+\varphi|(x) \le d(x,y) \qquad \text{for } \gamma\text{-a.e. } (x,y) \in X\times X \tag{10.3} \]

for any optimal plan $\gamma$. In the next theorem we show that an $L^\infty$ bound on geodesic
interpolation ensures that the inequalities are actually equalities.

To perform geodesic interpolation we will assume that $(X,d)$ is a geodesic space. In
such spaces, the optimal transport problem can be "lifted" to $\mathrm{Geo}(X)$ considering
all $\pi \in \mathscr{P}(\mathrm{Geo}(X))$ whose marginals at time 0 and at time 1 are
respectively $\mu$ and $\nu$ and minimizing

\[ \int\!\!\int_0^1 |\dot\gamma_s|^2\,ds\,d\pi(\gamma) = \int d^2(\gamma_0,\gamma_1)\,d\pi(\gamma) \]

in this class. Since $(e_0,e_1)_\sharp\pi$ is an admissible plan between $\mu$ and $\nu$, it
turns out that the infimum is not smaller than $W_2^2(\mu,\nu)$. But a simple measurable
geodesic selection argument provides equivalence of the problems and existence of optimal
$\pi$. This motivates the next definition.

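Definition 10.2 (Optimal geodesic plans) Let $\mu, \nu \in \mathscr{P}(X)$ be such that
$W_2(\mu,\nu) < \infty$. A plan $\pi \in \mathscr{P}(\mathrm{Geo}(X))$ is an optimal geodesic
plan between $\mu$ and $\nu$ if

\[ (e_0)_\sharp\pi = \mu, \qquad (e_1)_\sharp\pi = \nu, \qquad \int d^2(\gamma_0,\gamma_1)\,d\pi(\gamma) = \int\!\!\int_0^1 |\dot\gamma_s|^2\,ds\,d\pi(\gamma) = W_2^2(\mu,\nu). \]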

It is easy to check that

\[ t \mapsto (e_t)_\sharp\pi \tag{10.4} \]

is a constant speed geodesic in $\mathscr{P}(X)$ from $\mu$ to $\nu$ for all optimal geodesic
plans $\pi$ between $\mu$ and $\nu$. In particular, $(\mathscr{P}(X),W_2)$ is geodesic as well.
Also, $(e_0,e_1)_\sharp\pi$ is an optimal coupling whenever $\pi$ is an optimal geodesic plan.

Adapting the arguments in [38, Theorem 7.21, Corollary 7.22] for the locally compact case
and [23, 2] for the complete case, it can be shown that in any geodesic extended Polish space
$(X,\tau,d)$ (10.4) provides a description of all constant speed geodesics, see [24].
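
A sketch of why (10.4) is a constant speed geodesic, using only that an optimal geodesic plan $\pi$ is concentrated on constant speed geodesics, so that $d(\gamma_s,\gamma_t) = |t-s|\,d(\gamma_0,\gamma_1)$. The shorthand $\mu_t := (e_t)_\sharp\pi$ is introduced here for illustration:

```latex
% (e_s, e_t)_\sharp \pi is an admissible coupling of \mu_s and \mu_t, hence
W_2^2(\mu_s,\mu_t) \le \int d^2(\gamma_s,\gamma_t)\,d\pi(\gamma)
  = (t-s)^2 \int d^2(\gamma_0,\gamma_1)\,d\pi(\gamma)
  = (t-s)^2\,W_2^2(\mu,\nu), \qquad 0 \le s \le t \le 1 .
% The triangle inequality along 0 <= s <= t <= 1 then forces equality:
W_2(\mu,\nu) \le W_2(\mu,\mu_s) + W_2(\mu_s,\mu_t) + W_2(\mu_t,\nu)
  \le \big( s + (t-s) + (1-t) \big)\,W_2(\mu,\nu) = W_2(\mu,\nu) .
```

Since $W_2(\mu,\nu) < \infty$, none of the three upper bounds can be strict, so $W_2(\mu_s,\mu_t) = (t-s)\,W_2(\mu,\nu)$.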

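Theorem 10.3 (A metric Brenier's theorem) Let $\mu = \rho m \in \mathscr{P}(X)$ be satisfying
(10.2), let $\nu \in \mathscr{P}(X)$ with $W_2(\mu,\nu) < \infty$, let $\pi$ be an optimal
geodesic plan between $\mu$ and $\nu$ and let $\varphi : X \to \mathbb{R}\cup\{-\infty\}$ be a
Kantorovich potential relative to $(e_0,e_1)_\sharp\pi$. Assume that
$(e_s)_\sharp\pi = \mu_s = \rho_s m$ for all $s > 0$ sufficiently small and that

\[ \limsup_{s\downarrow 0} \|\rho_s\|_{L^\infty(\{V\le M\},m)} < \infty \qquad \forall M \ge 0. \tag{10.5} \]

Then

\[ d(\gamma_1,\gamma_0) = |\nabla^+\varphi|(\gamma_0) = |\nabla\varphi|_{w,\mathcal{T}}(\gamma_0) = |\nabla\varphi|_{w,\mathcal{G}}(\gamma_0) \qquad \text{for } \pi\text{-a.e. } \gamma \in \mathrm{Geo}(X). \tag{10.6} \]

As a consequence, $W_2^2(\mu,\nu) = \int_X |\nabla^+\varphi|^2\,d\mu$ and
$|\nabla^+\varphi| = |\nabla\varphi|_{w,\mathcal{T}} = |\nabla\varphi|_{w,\mathcal{G}}$
$m$-a.e. in $X$.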
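Proof. Set $g := |\nabla\varphi|_{w,\mathcal{G}} \in L^2(X,m)$. Taking (10.3) into account,
(10.6) can be achieved if we show that
$\int d^2(\gamma_1,\gamma_0)\,d\pi \le \int g^2(\gamma_0)\,d\pi$. Setting $f = -\varphi^c$ so
that $\varphi = Q_1 f$, for $\pi$-a.e. $\gamma \in \mathrm{Geo}(X)$ we have

\[ \varphi(\gamma_0) - \varphi(\gamma_t) \ge \Big( f(\gamma_1) + \frac{d^2(\gamma_0,\gamma_1)}{2} \Big) - \Big( f(\gamma_1) + \frac{d^2(\gamma_t,\gamma_1)}{2} \Big) = \frac{2t - t^2}{2}\,d^2(\gamma_0,\gamma_1) = \Big( t - \frac{t^2}{2} \Big)\,d^2(\gamma_0,\gamma_1). \tag{10.7} \]

Since the speed of $\gamma$ is $d(\gamma_0,\gamma_1)$ we have

\[ \big( \varphi(\gamma_0) - \varphi(\gamma_t) \big)^2 \le \Big( \int_0^t |\nabla\varphi|_{w,\mathcal{G}}(\gamma_s)\,d(\gamma_1,\gamma_0)\,ds \Big)^2 \le t\,d^2(\gamma_1,\gamma_0) \int_0^t g^2(\gamma_s)\,ds. \]

Set now $Z_M := \{\gamma \in \mathrm{Geo}(X) : V(\gamma_0) \le M,\ d(\gamma_0,\gamma_1) \le M\}$.
Dividing by $t^2 d^2(\gamma_1,\gamma_0) = d^2(\gamma_t,\gamma_0)$ and integrating on $Z_M$ with
respect to $\pi$, we obtain

\[ \frac{1}{t} \int_0^t\!\!\int_{Z_M} g^2(\gamma_s)\,d\pi(\gamma)\,ds \ge \int_{Z_M} \Big( \frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)} \Big)^2 d\pi(\gamma) \ge \frac{(2-t)^2}{4} \int_{Z_M} d^2(\gamma_0,\gamma_1)\,d\pi(\gamma). \]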

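Setting $L = \mathrm{Lip}(V)$, for all $\delta \ge LMt$ our choice of $Z_M$ gives
$V(\gamma_s) \le V(\gamma_0) + L\,d(\gamma_0,\gamma_s) \le M + \delta$ for every
$\gamma \in Z_M$ and $s \le t$, so that for $\mu_s = (e_s)_\sharp\pi$

\[ \frac{1}{t} \int_0^t\!\!\int_{\{V\le M+\delta\}} g^2\,d\mu_s\,ds \ge \int_{Z_M} \Big( \frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)} \Big)^2 d\pi(\gamma) \ge \frac{(2-t)^2}{4} \int_{Z_M} d^2(\gamma_0,\gamma_1)\,d\pi(\gamma). \tag{10.8} \]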

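In order to pass to the limit as $t \downarrow 0$, we observe that (10.5) gives

\[ \forall N \ge 0:\quad \int_{\{V\le N\}} f\,d\mu_s \to \int_{\{V\le N\}} f\,d\mu \quad \text{as } s \downarrow 0, \text{ for all } f \text{ with } f\chi_{\{V\le N\}} \in L^1(X,m). \tag{10.9} \]

Indeed for every bounded, Borel, and $d$-Lipschitz function $h : X \to \mathbb{R}$ we have

\[ \Big| \int_X h\,d\mu_s - \int_X h\,d\mu \Big| \le \int |h(\gamma_s) - h(\gamma_0)|\,d\pi(\gamma) \le s\,\mathrm{Lip}(h)\,W_2(\mu,\nu). \tag{10.10} \]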

On the other hand, arguing exactly as in the proof of Proposition 4.1, if
$f\chi_{\{V\le N\}} \in L^1(X,m)$ we can find a sequence $(h_n) \subset L^1(X,m)$ of bounded,
Borel, $d$-Lipschitz functions strongly converging to $f\chi_{\{V\le N\}}$ in $L^1(X,m)$. Upon
multiplying $h_n$ by the $d$-Lipschitz function $k_N(x) := \min\{1, (N + 1 - V(x))^+\}$, it is
not restrictive to assume that $h_n$ identically vanishes on $\{V \ge N+1\}$. If
$\|\rho_s\|_{L^\infty(\{V\le N+1\},m)} \le C$ for sufficiently small $s$ according to (10.5),
we thus have

\[ \Big| \int_{\{V\le N\}} f\,d\mu_s - \int_{\{V\le N\}} f\,d\mu \Big| \le 2C\,\|f\chi_{\{V\le N\}} - h_n\|_{L^1(X,m)} + \Big| \int_X h_n\,d\mu_s - \int_X h_n\,d\mu \Big|. \]

Taking first the limsup as $s \downarrow 0$, thanks to (10.10), and then the limit as
$n \to \infty$ we obtain (10.9).

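By (10.2) the functions $g^2\chi_{\{V\le M+\delta\}}$ belong to $L^1(X,m)$. Therefore, using
(10.9) with $f := g^2$ and $N := M + \delta$, passing to the limit in (10.8) first as
$t \downarrow 0$ and then as $\delta \downarrow 0$ gives

\[ \int_{\{V\le M\}} g^2\,d\mu \ge \limsup_{t\downarrow 0} \int_{Z_M} \Big( \frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)} \Big)^2 d\pi(\gamma) \ge \int_{Z_M} d^2(\gamma_1,\gamma_0)\,d\pi(\gamma). \tag{10.11} \]

Letting $M \to \infty$ this completes the proof of (10.6). □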
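
The factor in front of the last integral of (10.8) comes from combining (10.7) with the constant speed property $d(\gamma_0,\gamma_t) = t\,d(\gamma_0,\gamma_1)$. A short worked computation, written for a geodesic $\gamma$ with $\gamma_0 \neq \gamma_1$ and $t \in (0,1]$:

```latex
\frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)}
  \;\ge\; \frac{\big( t - \tfrac{t^2}{2} \big)\,d^2(\gamma_0,\gamma_1)}{t\,d(\gamma_0,\gamma_1)}
  \;=\; \Big( 1 - \frac{t}{2} \Big)\,d(\gamma_0,\gamma_1) \;\ge\; 0,
\qquad\text{hence}\qquad
\Big( \frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)} \Big)^2
  \;\ge\; \frac{(2-t)^2}{4}\,d^2(\gamma_0,\gamma_1) .
```

As $t \downarrow 0$ the factor $(2-t)^2/4$ tends to 1, which is why no constant survives in (10.11).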
The identification (10.6) could be compared to Theorem 6.1 of [10], where Cheeger
identified the relaxed gradient of Lipschitz functions with the local Lipschitz constant,
assuming that the metric measure space $(X,d,m)$ is doubling and satisfies the Poincaré
inequality. Without doubling conditions, but assuming the validity of good interpolation
properties, we are able to obtain an analogous identification at least in a suitable class of
$c$-concave functions.

For finite reference measures $m$ and densities $\rho$ uniformly bounded from below, we can
also prove a more precise convergence result for the difference quotients of $\varphi$.

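Theorem 10.4 Let $\mu = \rho m \in \mathscr{P}(X)$ be satisfying $\rho \ge c > 0$ $m$-a.e. in
$X$ and let $\varphi$, $\pi$ be as in Theorem 10.3. Then

\[ \lim_{t\downarrow 0} \frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)} = |\nabla^+\varphi|(\gamma_0) \qquad \text{in } L^2(\mathrm{Geo}(X),\pi). \tag{10.12} \]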

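Proof. The lower bound on $\rho$ yields in this case $|\nabla^+\varphi| \in L^2(X,m)$, hence
one can argue as in the proof of Theorem 10.3, this time integrating on the whole of
$\mathrm{Geo}(X)$, to get

\[ \int_X |\nabla\varphi|_{w,\mathcal{G}}^2\,d\mu \ge \limsup_{t\downarrow 0} \int_{\mathrm{Geo}(X)} \Big( \frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)} \Big)^2 d\pi(\gamma) \ge \int_{\mathrm{Geo}(X)} d^2(\gamma_1,\gamma_0)\,d\pi(\gamma) \]

in place of (10.11). Since (10.6) yields that all inequalities are equalities, and (10.7) yields

\[ \liminf_{t\downarrow 0} \frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)} \ge |\nabla^+\varphi|(\gamma_0) \qquad \text{for } \pi\text{-a.e. } \gamma \in \mathrm{Geo}(X), \]

we can use Lemma 10.5 below to obtain (10.12). □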



Lemma 10.5 Let $\sigma$ be a positive, finite measure in a measurable space $(Z,\mathcal{F})$
and let $f_n, f \in L^2(Z,\mathcal{F},\sigma)$ be satisfying

\[ \limsup_{n\to\infty} \int_Z f_n^2\,d\sigma \le \int_Z f^2\,d\sigma < \infty \tag{10.13} \]

and $\liminf_n f_n \ge f \ge 0$ $\sigma$-a.e. in $Z$. Then $f_n \to f$ in
$L^2(Z,\mathcal{F},\sigma)$.

Proof. If $f_n \ge 0$, it suffices to expand the square $(f_n - f)^2$ and to apply Fatou's
lemma. In the general case we obtain first the convergence of $f_n^+$ to $f$ in $L^2$, and then
use (10.13) once more to obtain that $f_n^- \to 0$ in $L^2$. □
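
The case $f_n \ge 0$ of the proof can be written out as follows. This is a sketch in which $\sigma$ denotes the measure of Lemma 10.5 and $f_n$, $f$ are as in its statement:

```latex
\int_Z (f_n - f)^2\,d\sigma
  = \int_Z f_n^2\,d\sigma - 2\int_Z f_n f\,d\sigma + \int_Z f^2\,d\sigma .
% Fatou's lemma applies to f_n f >= 0, and liminf_n (f_n f) >= f^2 because f >= 0:
\liminf_{n\to\infty} \int_Z f_n f\,d\sigma \;\ge\; \int_Z f^2\,d\sigma .
% Combining with (10.13):
\limsup_{n\to\infty} \int_Z (f_n - f)^2\,d\sigma
  \;\le\; \int_Z f^2\,d\sigma - 2\int_Z f^2\,d\sigma + \int_Z f^2\,d\sigma = 0 .
```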

Example 3.10 shows that the localization technique provided by the potential $V$ and (4.2)
plays an important role: indeed, in the same situation of that example, let
$m = \delta_0 + x^{-1}\mathscr{L}^1$ be a $\sigma$-finite measure in $X = [0,1]$, so that
$d\mu_t/dm(x) \le 1$ for any $t$, $x$. In this case the conclusions of the metric Brenier
theorem are not valid, since $\mu_0$ is concentrated at 0 and $d(0,y)$ takes all values in
$[0,1]$. Notice that $m$ is not locally finite and the class of continuous $m$-integrable
functions is not dense in $L^1([0,1];m)$ (any continuous and integrable function must vanish
at $x = 0$).

Remark 10.6 We remark that in the generality we are working with, it is not possible to
prove uniqueness of the optimal plan, nor that it is induced by a map, not even if we add a
$CD(K,\infty)$ assumption. To see why, consider the following example. Let $X = \mathbb{R}^2$
with the $L^\infty$ distance and the Lebesgue measure. Let
$\mu_0 := \chi_{[0,1]^2}\mathscr{L}^2$ and $\mu_1 := \chi_{[3,4]\times[0,1]}\mathscr{L}^2$.
Then, using standard tools of optimal transport theory, one can see that the only information
that one can get by analyzing the $c$-superdifferential of an optimal Kantorovich potential
is, in short, that any vertical line $\{t\}\times[0,1]$ must be sent onto the vertical line
$\{t+3\}\times[0,1]$. The constraint on the marginals gives that this transport of
$\{t\}\times[0,1]$ onto $\{t+3\}\times[0,1]$ must send the 1-dimensional Hausdorff measure on
$\{t\}\times[0,1]$ into the 1-dimensional Hausdorff measure on $\{t+3\}\times[0,1]$ for a.e.
$t$. Apart from this, there is no other constraint, so we see that there are quite many optimal
plans and that most of them are not induced by a map. Yet, the metric Brenier theorem is true,
as the distance each point travels is independent of the optimal plan chosen (and equal to 3
for $\mu_0$-a.e. $x$). ■
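
To see concretely why the travelled distance in this example does not depend on the plan, take any source point $(x_1,x_2) \in [0,1]^2$ and any admissible target $(x_1+3,y_2)$ with $y_2 \in [0,1]$. The symbols $x_1$, $x_2$, $y_2$ and $d_\infty$ (the $L^\infty$ distance on $\mathbb{R}^2$) are introduced only for this illustration:

```latex
d_\infty\big( (x_1,x_2),\,(x_1+3,y_2) \big) = \max\{\, 3,\ |y_2 - x_2| \,\} = 3,
\qquad\text{since } |y_2 - x_2| \le 1 < 3 .
```

Hence every plan that moves $\{t\}\times[0,1]$ onto $\{t+3\}\times[0,1]$ has the same cost $\int d_\infty^2\,d\gamma = 9$, however the mass is rearranged within each vertical line.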

Remark 10.7 Theorem 10.3 and Theorem 10.4, with the same proof, hold if we replace
condition (10.5) with the weaker one (at least in finite measure spaces)

\[ \limsup_{s\downarrow 0} \int_X \rho_s\log\rho_s\,dm < \infty, \]

but adding the condition $|\nabla^+\varphi| \in L^\infty(\{V \le M\},m)$ for all $M \ge 0$.
This, however, requires a slight modification of the class of test plans, and consequently of
the concept of minimal weak upper gradient, requiring that the marginals have only bounded
entropy instead of bounded density. This approach, which we do not pursue here, might be
particularly appropriate when $d$ is a bounded distance (e.g. in compact metric spaces),
because in this situation Kantorovich potentials are Lipschitz. ■